Configuration files
The programs (with the exception of `EWC3`) receive, as an argument, an XML file containing the sets of parameters to use with each algorithm. This page describes how to configure an algorithm.
An algorithm is defined using the `algorithm` tag and is composed of two fields: the `name` field, which contains the identifier of the algorithm, and the parameters of the algorithm (if any). A partial example is shown next:
<algorithms>
  <algorithm>
    <name>QLJM</name>
    <params>
      ...
    </params>
  </algorithm>
</algorithms>
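As a rough illustration of the structure above, the following Python sketch lists the algorithm identifiers found in a configuration. It is not the project's own parser: the `CONFIG` text and the `algorithm_names` helper are hypothetical, and the second algorithm without a `params` element is an assumption based on the "if any" wording.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration text mirroring the partial example above.
CONFIG = """
<algorithms>
  <algorithm>
    <name>QLJM</name>
    <params>
    </params>
  </algorithm>
  <algorithm>
    <name>Popularity</name>
  </algorithm>
</algorithms>
"""


def algorithm_names(xml_text):
    """Return the identifier of every <algorithm> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        alg.findtext("name", default="").strip()
        for alg in root.findall("algorithm")
    ]


print(algorithm_names(CONFIG))
```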
The name tells the program which algorithm you want to configure. Each algorithm has a fixed identifier in the code; the following table lists them:
Algorithm | Basic | Without term discrimination | Without length normalization |
---|---|---|---|
Random | Random | | |
Popularity | Popularity | | |
Most Common Neighbors | MCN | | |
Jaccard | Jaccard | | |
Adamic-Adar | Adamic | | |
Cosine similarity | Cosine | | |
BIR | BIR | | |
BM25 | BM25 | BM25 No Term Discrimination | BM25 No Length Normalization |
Extreme BM25 | EBM25 | EBM25 No Term Discrimination | EBM25 No Length Normalization |
Pivoted normalization VSM | Pivoted VSM | Pivoted VSM No Term Discrimination | Pivoted VSM No Length Normalization |
Query Likelihood Dirichlet | QLD | QLD No Term Discrimination | QLD No Length Normalization |
Query Likelihood Jelinek-Mercer | QLJM | QLJM No Term Discrimination | QLJM No Length Normalization |
Query Likelihood Laplace | QLL | | |
PL2 | PL2 | | |
DLH | DLH | | |
DPH | DPH | | |
DFRee | DFRee | | |
DFReeKLIM | DFReeKLIM | | |
The parameters, represented with the `params` tag, indicate the parameter combinations to be executed by the program. The `params` element must contain the individual parameters, each represented by a `param` tag.
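To make "parameter combinations" concrete, here is a minimal Python sketch that enumerates every combination of values across a set of parameters. It assumes the program runs the full Cartesian product of the listed values, which this page does not state explicitly; the `parameter_grid` helper and the `orientation` parameter name are hypothetical.

```python
from itertools import product


def parameter_grid(param_values):
    """Build one dict per combination of parameter values.

    Assumption: combinations are the full Cartesian product of the
    value sets, taken in the order the parameters are listed.
    """
    names = list(param_values)
    value_sets = [param_values[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_sets)]


# Hypothetical parameters: two values for one, three for the other.
grid = parameter_grid({
    "lambda": [0.1, 0.5],
    "orientation": ["IN", "OUT", "UND"],
})

for combination in grid:
    print(combination)
```

Under this assumption, two parameters with 2 and 3 values give 6 runs, so the grid grows multiplicatively with every parameter added.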
Each individual parameter is defined by three fields: its name (field `name`), its type (field `type`), and the set of values it can take (field `values`). An example follows:
<params>
  <param>
    <name>lambda</name>
    <type>Double</type>
    <values>
      ...
    </values>
  </param>
  <param>
    ...
  </param>
</params>
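The three fields of a parameter can be read with a few lines of Python. This is only a sketch of the layout described above, not the project's parser: the `PARAMS` text, its two values, and the `read_params` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical <params> block; the two values are illustrative only.
PARAMS = """
<params>
  <param>
    <name>lambda</name>
    <type>Double</type>
    <values>
      <value>0.1</value>
      <value>0.5</value>
    </values>
  </param>
</params>
"""


def read_params(xml_text):
    """Return (name, type, raw string values) for every <param> element."""
    root = ET.fromstring(xml_text)
    result = []
    for param in root.findall("param"):
        name = param.findtext("name", default="").strip()
        type_name = param.findtext("type", default="").strip()
        # Only individually listed values are collected in this sketch.
        raw_values = [
            value.text.strip()
            for value in param.findall("values/value")
            if value.text
        ]
        result.append((name, type_name, raw_values))
    return result


print(read_params(PARAMS))
```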
The name of the parameter depends on the algorithm. The repository includes two files, `conf\fullgrid-dir.xml` and `conf\fullgrid-undir.xml`, which contain a configuration example for each algorithm (for directed and undirected graphs, respectively), so the parameter names are easy to look up.
The type determines the variable type used to read the parameter values. The following types are available:
Type | Description |
---|---|
Integer | The values will be taken as int values |
Long | The values will be taken as long values |
Double | The values will be taken as double values |
Boolean | The values will be taken as boolean values. If the value is "true", it will be taken as the true boolean value. Otherwise, it will be taken as false. |
String | The values will be taken as they are (as strings) |
LinkOrientation | The values represent the selection of edges for a user. It can take three values: IN for the incoming neighborhood, OUT for the outgoing neighborhood and UND for the union of both. |
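The table above can be read as a mapping from a type name to a conversion rule. The Python sketch below mirrors it under stated assumptions: the `convert` helper and the `LinkOrientation` enum are hypothetical stand-ins, Python has a single integer type so `Integer` and `Long` are treated alike, and the comparison with "true" is assumed to be exact (the program may be more lenient about letter case).

```python
from enum import Enum


class LinkOrientation(Enum):
    """Hypothetical stand-in for the three orientations in the table."""
    IN = "IN"    # incoming neighborhood
    OUT = "OUT"  # outgoing neighborhood
    UND = "UND"  # union of both


def convert(type_name, raw):
    """Convert a raw string value according to its declared type."""
    if type_name in ("Integer", "Long"):
        return int(raw)
    if type_name == "Double":
        return float(raw)
    if type_name == "Boolean":
        # Only "true" maps to True; anything else maps to False.
        return raw == "true"
    if type_name == "String":
        return raw
    if type_name == "LinkOrientation":
        return LinkOrientation(raw)
    raise ValueError(f"Unknown parameter type: {type_name}")


print(convert("Double", "0.1"))
print(convert("Boolean", "yes"))
print(convert("LinkOrientation", "UND"))
```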
Finally, there are two ways to describe the values: listing the individual values, as shown in the following example:
<values>
  <value>0.1</value>
  <value>0.2</value>
  ...
  <value>1.0</value>
</values>
or using ranges, as shown next:
<values>
  <range>
    <start>0.1</start>
    <end>1.0</end>
    <step>0.1</step>
  </range>
</values>
where `start` indicates the first value, `end` the last one (both included), and `step` the difference between consecutive values in the range.
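The range semantics just described (inclusive bounds, fixed step) can be sketched in Python as follows. The `expand_range` helper is hypothetical, and using `Decimal` to avoid floating-point drift is a choice made for this sketch; how the program itself accumulates the step is not stated here.

```python
from decimal import Decimal


def expand_range(start, end, step):
    """Expand an inclusive range given as strings, as they appear in the XML.

    Decimal arithmetic keeps 0.1 + 0.1 + ... exact, so the last value
    is not lost to rounding when it coincides with `end`.
    """
    first, last, delta = Decimal(start), Decimal(end), Decimal(step)
    if delta <= 0:
        raise ValueError("step must be positive")
    values = []
    current = first
    while current <= last:
        values.append(float(current))
        current += delta
    return values


print(expand_range("0.1", "1.0", "0.1"))
```

With the range from the example above, this yields the ten values from 0.1 to 1.0, the same set as the explicit list of individual values.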
Both ranges and individual values can be combined at will, even in the same parameter. For example:
<values>
  <value>1.0</value>
  <range>
    <start>0.1</start>
    <end>0.999</end>
    <step>0.1</step>
  </range>
</values>
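To see what the combined example resolves to, the Python sketch below expands a `values` element that mixes individual values and ranges. It is an illustration only: the `expand_values` helper is hypothetical, values are assumed to be numeric, and keeping them in document order is an assumption.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The combined example from above, as a string.
VALUES = """
<values>
  <value>1.0</value>
  <range>
    <start>0.1</start>
    <end>0.999</end>
    <step>0.1</step>
  </range>
</values>
"""


def expand_values(xml_text):
    """Expand <value> and <range> children into a flat list of floats."""
    root = ET.fromstring(xml_text)
    expanded = []
    for child in root:
        if child.tag == "value":
            expanded.append(float(child.text.strip()))
        elif child.tag == "range":
            current = Decimal(child.findtext("start").strip())
            last = Decimal(child.findtext("end").strip())
            step = Decimal(child.findtext("step").strip())
            if step <= 0:
                raise ValueError("step must be positive")
            # Both bounds are inclusive; stop once `end` is exceeded.
            while current <= last:
                expanded.append(float(current))
                current += step
    return expanded


print(expand_values(VALUES))
```

Here the range stops at 0.9, because the next candidate (1.0) exceeds the `end` of 0.999, and 1.0 enters the set only through the individual value.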
The following table lists the parameter configurations we selected for the experiments in our paper. These configurations are included in the `conf` folder. The notation is the same as in the previous formulas. In addition, similarly to [1], for directed graphs we choose different orientations for the target and candidate users' neighborhoods (the incoming, outgoing or undirected neighborhood of the users). In BM25 and EBM25 we also select a different neighborhood for the length.
Algorithm (and variants) | Parameters |
---|---|
BM25 | |
EBM25 | |
QLD | |
QLJM | |
QLL | |
PL2 | |
Pivoted normalization | |