Web Crawler is a program that recursively downloads web pages, following every link reachable from a given base URL, and indexes the downloaded pages by keyword.
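For orientation, the control flow can be sketched roughly as below. This is not the project's actual code: fetch_page(), extract_links(), and index_keywords() are hypothetical placeholders (stubbed out so the sketch compiles), and the traversal is written iteratively with a work queue rather than by literal recursion.

    /* Sketch of the crawl loop: visit the base URL, index each page,
     * and enqueue the links it contains. Helpers are hypothetical stubs. */
    #define _POSIX_C_SOURCE 200809L   /* for strdup() */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX_URLS 1000

    static char *queue[MAX_URLS];  /* URLs waiting to be downloaded */
    static char *seen[MAX_URLS];   /* URLs already visited          */
    static int head, tail, nseen;

    /* Hypothetical helpers, stubbed so this sketch compiles;
     * the real project supplies its own implementations. */
    static char *fetch_page(const char *url) { (void)url; return NULL; }
    static int extract_links(const char *html, char **out, int max) {
        (void)html; (void)out; (void)max; return 0;
    }
    static void index_keywords(const char *url, const char *html) {
        (void)url; (void)html;
    }

    static int already_seen(const char *url) {
        for (int i = 0; i < nseen; i++)
            if (strcmp(seen[i], url) == 0)
                return 1;
        return 0;
    }

    static void crawl(const char *start_url) {
        queue[tail++] = strdup(start_url);
        while (head < tail) {
            char *url = queue[head++];
            if (already_seen(url)) { free(url); continue; }
            seen[nseen++] = url;                /* mark as visited     */

            char *html = fetch_page(url);       /* download the page   */
            if (html == NULL) continue;
            index_keywords(url, html);          /* index it by keyword */

            char *links[64];
            int n = extract_links(html, links, 64);
            for (int i = 0; i < n && tail < MAX_URLS; i++)
                queue[tail++] = links[i];       /* enqueue new links   */
            free(html);
        }
    }

    int main(int argc, char **argv) {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <start-url>\n", argv[0]);
            return 1;
        }
        crawl(argv[1]);
        return 0;
    }

A work queue gives the same coverage as recursion while avoiding unbounded call-stack depth on link-heavy sites, and the visited list keeps pages that are linked from several places from being downloaded more than once.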
TO BUILD:
- Check out all the files
- Type "make"
TO RUN:
- The binary is in WEBCRAWLER/bin
- USAGE: crawler <start-url> <output-file> <stopword-file>
- <start-url> is the base URL where the crawl begins. Choose it
  carefully: the crawler follows every link it finds, so a broad
  starting URL can lead to downloading a very large number of pages.
- <output-file> is the file where the crawler writes its output
- <stopword-file> is a text file containing words to skip when
  indexing, such as "a" or "the"
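For example (the URL and file names below are illustrative placeholders, not files shipped with the project):

    WEBCRAWLER/bin/crawler http://example.com/ index.out stopwords.txt

Here stopwords.txt would list the words to skip ("a", "an", "the", ...); the one-word-per-line layout implied by this example is an assumption, not a documented format.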
A description of the data structures can be found in the docs directory.