semEP

1. About

semEP is an edge partitioning approach that combines a data mining framework for link prediction, semantic knowledge (similarities) from ontologies, and an algorithmic approach to partition the edges of a heterogeneous graph.

For more information about semEP see:

Guillermo Palma, Maria-Esther Vidal and Louiqa Raschid. Drug-Target Interaction Prediction Using Semantic Similarity and Edge Partitioning. Proceedings of the 12th International Semantic Web Conference (ISWC 2013). Italy. 2014. (PDF)

2. Content

AUTHORS: list of contributors of the semEP project.
doc: Documentation about semEP.
LICENSE: GPL version 2.
Makefile: builds semEP.
README.md: this file.
src: source code.
test: datasets to test semEP
- test/yamanishi: link prediction dataset proposed by Yamanishi et al. [1].
- test/pawels: drug side-effect dataset proposed by Pauwels et al. [2].
VERSION: software version.

[1] K. Bleakley and Y. Yamanishi. Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics, 25(18):2397-2403, 2009.

[2] E. Pauwels, V. Stoven, and Y. Yamanishi. Predicting drug side-effect profiles: a chemical fragment-based approach. BMC Bioinformatics, 12:169, 2011.

3. License

GNU GENERAL PUBLIC LICENSE Version 2.

4. Requirements

GNU Colecction Compiler (GCC) or Clang.
GNU make (make).

semEP has been tested on FreeBSD, GNU/Linux, and OS X.

5. Installation

Clean and generate necessary files:

$>make clean

$>make

6. Usage

semEP has 7 command line arguments and all them are mandatory. Here, mandatory means that without specifying this argument, the program won't work.

semEP command synopsis:

semEP <-l left threshold> <-r right threshold> <left matrix> <left vertices> <right matrix> <right vertices> <bipartite graph>

where:

bipartite grap: file with the bipartite graph (BPG) to study.
left threshold: threshold of similarity between the left side vertices of the BPG.
right threshold: threshold of similarity between the right side vertices of the BPG.
left matrix: similarity matrix file with the similarities between the left side vertices of the BPG.
left vertices: file with the list of vertices of the left side of the BPG. The order of the vertices corresponds to the left matrix.
right matrix: similarity matrix file with the similarities between the right side vertices of the BPG.
right vertices: file with the list of vertices of the right side of the BPG. The order of the vertices corresponds to the right matrix.

7. Running some samples:

Compute the partitions in a drug-target bipartite graph and get predicted links on nuclear receptor dataset:

./semEP -l 0.3061 -r 0.1614 test/yamanishi/nr/nr_matrix_drugs.txt test/yamanishi/nr/nr_drugs.txt test/yamanishi/nr/nr_matrix_targets.txt test/yamanishi/nr/nr_targets.txt test/yamanishi/nr/nr_drug-target_graph.txt

Compute the partitions in a drug-target bipartite graph and get predicted links on enzyme dataset:

./semEP -l 0.2174 -r 0.0198 test/yamanishi/e/e_matrix_drugs.txt test/yamanishi/e/e_drugs.txt test/yamanishi/e/e_matrix_targets.txt test/yamanishi/e/e_targets.txt test/yamanishi/e/e_drug-target_graph.txt

8 semEP input

8.1. File format of the bipartite graph

A bipartite graph is a graph whose vertices can be divided into two disjoint sets L and R such that every edge connects a vertex in L to one in R. We can denote a bipartite graph as G=(L,R,E), with E denoting the edges of the graph. We call the set L as the left side of the bipartite graph, and the set R as the right side of the bipartite graph.

The first line of the bipartite graph file, contains the number of edges. The following lines in the file shows the edges data, where each line corresponds to a edge and contains the node of the left side, the node of the right side, and the cost of the edge.

[number of edges n]
[right-node edge-1][TAB][lef-tnode edge-1][TAB][edge-cost-1]
..
..
..
[right-node edge-n][TAB][left-node edge-n][TAB][edge-cost-n]

8.2. File format of the similarity matrix

Files with the matrices must contain the similarities between the vertices of the left side, or the right side. The file format is as follows:

[number of rows and columns]
[sim vertex-1 vertex-1][SPC]...[SPC][sim vertex-1 vertex-n]
..
..
..
[sim vertex-n vertex-1][SPC]...[SPC][sim vertex-n vertex-n]

8.3. File format of the list vertices

The file format is as follows:

[number of vertices n]
[vertex 1]
..
..
..
[vertex n]

Where [vertex x] is the identifier of vertex in the left side or right side of the bipartite graph. The order of each vertex correspond to the position of the vertex in the similarity matrix.

9. semEP ouput

semEP produces two outputs:

a directory with suffix -Clusters that contains the clusters with the edge partitioning of the bipartite graph. Each cluster correspond to a file on the directory;
a file with suffix -Predictions.txt that contains the semEP predictions for each cluster and the probability of each prediction.

10. Contact

I hope you find semEP an useful tool. Please, let me know any comment, problem, bug, or suggestion.

Guillermo Palma

palma at l3s dot de

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

doc

doc

src

src

test

test

.gitignore

.gitignore

AUTHORS

AUTHORS

LICENSE

LICENSE

Makefile

Makefile

README.md

README.md

VERSION

VERSION

Repository files navigation

semEP

1. About

2. Content

3. License

4. Requirements

5. Installation

6. Usage

7. Running some samples:

8 semEP input

8.1. File format of the bipartite graph

8.2. File format of the similarity matrix

8.3. File format of the list vertices

9. semEP ouput

10. Contact

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
doc		doc
src		src
test		test
.gitignore		.gitignore
AUTHORS		AUTHORS
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
VERSION		VERSION

License

gpalma/semep

Folders and files

Latest commit

History

Repository files navigation

semEP

1. About

2. Content

3. License

4. Requirements

5. Installation

6. Usage

7. Running some samples:

8 semEP input

8.1. File format of the bipartite graph

8.2. File format of the similarity matrix

8.3. File format of the list vertices

9. semEP ouput

10. Contact

About

Resources

License

Stars

Watchers

Forks

Languages