gnkitaa/tdminer-cpp

#TDMiner C++: Frequent episode mining with inter-event gap constraints.

Usage: tdminer [ht:i:m:sxcdz:o:f:] eventsfile outcomefile [min_supp]

 eventsfile	    file that contains the event stream
 outcomefile	    file to which the outcome is written
 min_supp	    support threshold (default 0.0100)

Basic options:
 -h			Gives this help display.
 -t <num>		Specifies the number of pthreads (default 1).
 -i <intervalfile>	Specifies the interval file (default ivl.txt).
			See the example ivl.txt for the format.
 -m <num>		Specifies the maximum level up to which mining is done.
			(default -1 i.e. no limit). Level corresponds to episode size.
 -s			Specifies minimum support as a ratio of #occurrences to total time span.
			By default minimum support is the ratio of #occurrences
			to total number of events in the data.
 -x			Turn off pre-count pruning heuristics.
 -c			Takes the first column as the customer id
			and prevents patterns from crossing between customers.
 -d			Enables event durations. Reads the last column as the end time of an event.
 -z <num>		Set the list size for storing time stamps.
 -o <epsfile>		Counts the episodes specified in <epsfile>.
			The result of counting is stored in <outcomefile>.
 -f <occurfile>		Specifies the file into which episode occurrences are written.
			This works only in conjunction with -o.
			The occurrences are output as <episode_index> : <list of time stamps>.
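
The two support definitions selected by -s can be illustrated with a small sketch. The helper names below are hypothetical and are not part of tdminer, and the sketch assumes that "total time span" means the last time stamp minus the first one in the event stream; the tool's actual computation may differ.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helpers, not taken from the tdminer sources.

// Default behaviour as described in the help text:
// support = #occurrences / total number of events in the data.
double support_by_events(std::size_t occurrences, std::size_t total_events) {
    if (total_events == 0) throw std::invalid_argument("empty event stream");
    return static_cast<double>(occurrences) / static_cast<double>(total_events);
}

// Behaviour described for -s:
// support = #occurrences / total time span.
// ASSUMPTION: the time span is (last time stamp - first time stamp).
double support_by_time_span(std::size_t occurrences, double t_first, double t_last) {
    const double span = t_last - t_first;
    if (span <= 0.0) throw std::invalid_argument("non-positive time span");
    return static_cast<double>(occurrences) / span;
}
```

Under these assumptions, an episode with 5 occurrences in a stream of 20 events spanning 10 time units has support 0.25 by the default definition and 0.5 with -s, so the same min_supp value can admit different episodes depending on the flag.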

Mining Example:
$ tdminer sample_stream.csv test_episodes.txt 0.01

Counting example:
$ tdminer -o eps.txt -f occurrences.txt sample_stream.csv episodes-out.txt
"eps.txt" contains the episodes to count. Note that currently only episodes of
the same size are supported.
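
The occurrence file written with -f can be post-processed line by line. The following is a minimal sketch (C++17) that assumes each line has the form <episode_index> : <list of time stamps> with whitespace-separated numeric time stamps; the exact delimiter and number format are assumptions, and the struct and function names are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one line of the occurrence file.
struct EpisodeOccurrences {
    int episode_index;
    std::vector<double> time_stamps;
};

// Parses "<episode_index> : <t1> <t2> ...".
// ASSUMPTION: time stamps are whitespace-separated numbers.
// Returns std::nullopt if the ':' separator or the index is missing.
std::optional<EpisodeOccurrences> parse_occurrence_line(const std::string& line) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) return std::nullopt;

    EpisodeOccurrences result{};
    std::istringstream index_stream(line.substr(0, colon));
    if (!(index_stream >> result.episode_index)) return std::nullopt;

    // Read numbers until the stream ends or a non-numeric token appears.
    std::istringstream stamps_stream(line.substr(colon + 1));
    double t = 0.0;
    while (stamps_stream >> t) result.time_stamps.push_back(t);
    return result;
}
```

Applying parse_occurrence_line to each line read with std::getline from occurrences.txt would yield one record per counted episode; lines without a ':' separator are reported as std::nullopt rather than aborting.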

Dependencies: PThreads library
