trekhleb/links-detector

#artificialintelligence 

The cool part about this approach is that we have the freedom to generate training examples for different fonts, ligatures, text colors, background colors. This is very useful if we want to avoid the model overfitting during the training (so that the model could generalize well to unseen real-world examples instead of failing once the background shade is changed for a bit). It is also possible to generate a variety of link types like http://, http://, ftp://, tcp:// etc. Otherwise, it might be hard to find enough real-world examples of this kind of links for training. Another benefit of this approach is that we could generate as many training examples as we want. We're not limited to the number of pages of the printed book we've found for the dataset.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found