Siddharthpratapsingh/Hindi-search-engine-Crawled-data-

Siddharth-Singh

#These are the web crawlers used to collect data from various websites.
#All of the code and modules are written in Python.
#RSS.py is a feedparser program that crawls the Amarujala website through its RSS feeds.
#The main database is MongoDB, which is connected to Python via PyMongo.
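As a rough illustration of the RSS-to-MongoDB flow described above, here is a minimal sketch. It is not the code in RSS.py: the function names, the document fields, and the choice of the article link as `_id` are all assumptions made for the example. It assumes the third-party `feedparser` package for parsing and a PyMongo collection for storage.

```python
from typing import Any, Dict, Iterable, List, Mapping


def entries_to_documents(entries: Iterable[Mapping[str, Any]], source: str) -> List[Dict[str, Any]]:
    """Turn feed entries (dict-like objects) into plain dicts for MongoDB.

    The field names below are hypothetical, chosen for illustration only.
    """
    documents = []
    for entry in entries:
        link = entry.get("link")
        if not link:
            continue  # skip entries without a URL, since the URL is used as the key
        documents.append({
            "_id": link,  # assumption: the article URL identifies a story
            "source": source,
            "title": entry.get("title", ""),
            "summary": entry.get("summary", ""),
            "published": entry.get("published", ""),
        })
    return documents


def store_documents(collection: Any, documents: Iterable[Dict[str, Any]]) -> int:
    """Upsert each document by _id so that re-crawling a feed adds no duplicates.

    `collection` is expected to behave like a pymongo Collection.
    """
    count = 0
    for doc in documents:
        collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        count += 1
    return count


def crawl_feed(feed_url: str, collection: Any, source: str) -> int:
    """Fetch one RSS feed and store its entries; returns the number processed."""
    import feedparser  # third-party package, imported here so the helpers above work without it

    parsed = feedparser.parse(feed_url)
    return store_documents(collection, entries_to_documents(parsed.entries, source))
```

To run it against a real database, one would pass a collection such as `MongoClient("mongodb://localhost:27017")["hindi_search"]["articles"]` (database and collection names are placeholders) together with the feed URL to `crawl_feed`. Upserting by URL is one possible way to keep repeated crawls idempotent; the repository may handle duplicates differently.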

#samacharfeeds.py is a feedparser program, written in Python, that crawls data from the samachar-jagat website.
#tfidf.py is a small Python program for finding term frequency (machine learning).
#q.py is a small program that produces a summary of the corpus. It uses sumPY.
#Word2Vec is a word-embedding model developed at Google; it is used here from Python as part of the machine-learning work.
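The file name tfidf.py suggests TF-IDF weighting, so the sketch below shows one common way to compute it in plain Python. This is an assumption about what the script does, not its actual code. Whitespace tokenization and the unsmoothed `log(N / df)` form of the inverse document frequency are simplifications chosen for the example; Hindi text would usually need punctuation handling (for instance the danda `।`) before splitting.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Split on whitespace; a simplification that ignores punctuation."""
    return text.split()


def tf_idf(documents: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per document.

    tf  = occurrences of the term in the document / tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

With this definition, a term that occurs in every document gets a weight of zero, which is why words common to the whole corpus drop out of the ranking.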
