public class XLink extends Object
Constructor and Description |
---|
XLink() |
Modifier and Type | Method | Description |
---|---|---|
static Double | calculateCosineSimilarity(HashMap<String,Double> firstFeatures, HashMap<String,Double> secondFeatures) | Calculates the cosine similarity between the feature vectors of two clusters. Each feature vector is represented as a HashMap. |
static Double | calculateNorm(HashMap<String,Double> feature) | Calculates the norm of one feature vector. |
static int | computeLevenshteinDistance(CharSequence str1, CharSequence str2) | Finds and returns the Levenshtein (edit) distance between two words. |
static double | computeStoilosSimilarity(String st1, String st2) | Computes the similarity of two strings based on the metric proposed in the paper "A String Metric For Ontology Alignment", published in ISWC 2005. |
void | connect() | Infer how the identified entities are connected (only direct associations are considered, i.e. |
void | connect(String category) | Infer how the identified entities are connected (only direct associations are considered, i.e. |
void | disambiguate(ArrayList<Category> categories) | Tries to disambiguate the named entities based on their semantic category. |
void | enrich() | Enriches the retrieved entities with semantic information. |
void | enrich(Category c, PropertiesType type) | Enriches the retrieved entities of a specific category with semantic information of the given PropertiesType. |
void | enrich(Category c, PropertiesType type, String languageCode) | Enriches the retrieved entities of a specific category with semantic information of the given PropertiesType, in the language given by languageCode (optional). |
void | enrich(Entity e, PropertiesType type) | Enriches the retrieved entity with semantic information of the given PropertiesType. |
void | enrich(Entity e, PropertiesType type, String languageCode) | Enriches the retrieved entity with semantic information of the given PropertiesType, in the language given by languageCode (optional). |
void | enrich(Entity e, String query, String endpoint) | Enriches the retrieved entity e with semantic information using the given query. |
void | enrich(PropertiesType type) | Enriches the retrieved entities with semantic information of the given PropertiesType. |
void | enrich(PropertiesType type, String languageCode) | Enriches the retrieved entities with semantic information of the given PropertiesType, in the language given by languageCode (optional). |
void | enrichWithTopURIs() | Enriches the retrieved entities with semantic information coming from their top ranked URIs. |
static HashMap<String,Double> | findFreqs(String text) | Finds and returns the frequencies of terms in a text. |
void | findRelevanceScores(ArrayList<Category> categories) | Finds the relevance scores between the semantics of the entities and the content of the sentences. |
void | findSnippets() | Finds the corresponding document snippets for the retrieved entities. |
void | findTopEntities() | Finds the top ranked entity of a document. |
ArrayList<Triple> | getAssociations() | Returns the associating triples of the entities. |
String | getDocumentContent() | Returns the input document content. |
String | getDocumentPath() | Returns the input document path. |
EntityMiningComponent | getEmc() | Returns the entity mining component. |
ArrayList<Entity> | getEntities() | Returns a list with the detected entities. |
Set<Entity> | getEntitiesWithName(String label) | Returns the entities, across all categories, whose name attribute has the value "label". |
EntityMiningComponent | getEntityMiningComponent() | Returns the entity mining component. |
ArrayList<Entity> | getTopEntities() | Returns a list with the top-K detected entities. |
void | link() | Tries to match the detected entities with URIs from the underlying knowledge bases (i.e. |
void | link(Category c) | Tries to match the detected entities, of a specific category c, with URIs from the underlying knowledge bases (i.e. |
void | link(Entity e) | Tries to match the entity e, with URIs from the underlying knowledge bases (i.e. |
void | rankEntities(double f, double x) | Ranks the retrieved entities based on the following formula: rank = ( # of entity appearances / # of total entities appearances ) * X% + ( # of entity appearances in first f% of text / # of total entities appearances in first f% of text ) * (1-X)%. |
void | rankURIs() | Ranks the retrieved URIs based on their relevance. |
void | retrieveAllProperties() | |
void | retrieveEntities(String content, HashSet<String> acceptedCategoryNames) | Performs entity mining on the given content, according to the given categories. |
void | retrieveEntities(TextExtractor extractor, HashSet<String> acceptedCategoryNames) | Performs entity mining with the given text extractor, according to the given categories. |
void | retrieveIncomingProperties() | |
void | retrieveOutcomingProperties() | |
ArrayList<Triple> | retrieveTriples(String templateQueryFile, String endpoint, String uri, String languageCode) | Retrieves the triples of an entity's given URI. |
void | setAssociations(ArrayList<Triple> associations) | Sets the associating triples. |
void | setDocumentContent(String document) | Sets/changes the input document content. |
void | setDocumentPath(String documentPath) | Sets/changes the input document path. |
void | setEmc(EntityMiningComponent emc) | Sets/changes the entity mining component. |
void | setEntities(ArrayList<Entity> entities) | Sets/changes the detected entities. |
void | setEntityMiningComponent(EntityMiningComponent emc) | Sets/changes the entity mining component. |
void | setTopEntities(ArrayList<Entity> topEntities) | Sets/changes the top-K detected entities. |
public void setEntityMiningComponent(EntityMiningComponent emc)
emc
- The entity mining component.
public EntityMiningComponent getEntityMiningComponent()
public void retrieveEntities(TextExtractor extractor, HashSet<String> acceptedCategoryNames) throws FalseFileTypeException
extractor
- The text extractor that will be used for getting the text in which we will perform entity mining.
acceptedCategoryNames
- The names of the categories for which we want to detect entities.
FalseFileTypeException
public void retrieveEntities(String content, HashSet<String> acceptedCategoryNames)
content
- The text in which we want to perform entity mining.
acceptedCategoryNames
- The names of the categories for which we want to detect entities.
public void link() throws CategoryNotFoundException, IOException
CategoryNotFoundException
IOException
public void link(Category c) throws CategoryNotFoundException, IOException
c
- The category whose entities we want to link.
CategoryNotFoundException
IOException
public void link(Entity e) throws CategoryNotFoundException, IOException
e
- The entity we want to link.
CategoryNotFoundException
IOException
public void retrieveIncomingProperties()
public void retrieveOutcomingProperties()
public void retrieveAllProperties()
public void connect(String category) throws CategoryNotFoundException, IOException
category
- The category whose entities we want to connect.
CategoryNotFoundException
IOException
public void connect() throws CategoryNotFoundException, IOException
CategoryNotFoundException
IOException
public void enrich() throws CategoryNotFoundException, IOException
CategoryNotFoundException
IOException
public void enrich(Entity e, PropertiesType type, String languageCode) throws CategoryNotFoundException, IOException
e
- The identified entity we want to enrich.
type
- The properties type (e.g. incoming, outgoing etc)
languageCode
- The language of retrieved properties (optional)
CategoryNotFoundException
IOException
public void enrich(PropertiesType type, String languageCode) throws CategoryNotFoundException, IOException
type
- The properties type (e.g. incoming, outgoing etc)
languageCode
- The language of retrieved properties (optional)
CategoryNotFoundException
IOException
public void enrich(Category c, PropertiesType type, String languageCode) throws CategoryNotFoundException, IOException
c
- The category whose entities we want to enrich.
type
- The properties type (e.g. incoming, outgoing etc)
languageCode
- The language of retrieved properties (optional)
CategoryNotFoundException
IOException
public void enrich(Entity e, PropertiesType type) throws CategoryNotFoundException, IOException, NoProperSPARQLTemplateQuery
e
- The identified entity we want to enrich.
type
- The properties type (e.g. incoming, outgoing etc)
CategoryNotFoundException
IOException
NoProperSPARQLTemplateQuery
public void enrichWithTopURIs() throws CategoryNotFoundException, IOException, NoProperSPARQLTemplateQuery
public void enrich(PropertiesType type) throws CategoryNotFoundException, IOException, NoProperSPARQLTemplateQuery
type
- The properties type (e.g. incoming, outgoing etc)
CategoryNotFoundException
IOException
NoProperSPARQLTemplateQuery
public void enrich(Category c, PropertiesType type) throws CategoryNotFoundException, IOException, NoProperSPARQLTemplateQuery
c
- The category whose entities we want to enrich.
type
- The properties type (e.g. incoming, outgoing etc)
CategoryNotFoundException
IOException
NoProperSPARQLTemplateQuery
public void enrich(Entity e, String query, String endpoint) throws CategoryNotFoundException, IOException
e
- The entity we want to enrich.
query
- The SPARQL query we use to enrich entity e.
endpoint
- The endpoint
CategoryNotFoundException
IOException
public EntityMiningComponent getEmc()
public void setEmc(EntityMiningComponent emc)
emc
- The entity mining component.
public ArrayList<Entity> getEntities()
public ArrayList<Entity> getTopEntities()
public void setEntities(ArrayList<Entity> entities)
entities
- The entities.
public void setTopEntities(ArrayList<Entity> topEntities)
topEntities
- The top-K entities.
public void setDocumentContent(String document)
document
- The input document
public String getDocumentContent()
public void setDocumentPath(String documentPath)
documentPath
- The input document path
public String getDocumentPath()
public static int computeLevenshteinDistance(CharSequence str1, CharSequence str2)
str1
- The first word.
str2
- The second word.
public ArrayList<Triple> retrieveTriples(String templateQueryFile, String endpoint, String uri, String languageCode) throws IOException
templateQueryFile
- The template query for retrieving triples.
endpoint
- The endpoint of a knowledge base
uri
- An entity's URI.
languageCode
- The ISO code of a language.
IOException
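The edit distance returned by computeLevenshteinDistance, documented above, can be illustrated with a small standalone sketch. This is not the XLink source: the class name LevenshteinSketch and the method name distance are hypothetical, and the sketch assumes the textbook definition (unit-cost insertions, deletions, and substitutions, with case-sensitive comparison), which this page does not confirm.

```java
/** Hypothetical stand-in, not the XLink implementation. */
final class LevenshteinSketch {

    /** Minimum number of single-character insertions, deletions, and substitutions. */
    static int distance(CharSequence a, CharSequence b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting"));
    }
}
```

Keeping only two rows of the dynamic-programming table is a memory optimisation; the result is the same as with the full matrix.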
public ArrayList<Triple> getAssociations()
public void setAssociations(ArrayList<Triple> associations)
associations
- The associating triples.
public void findSnippets()
public static Double calculateCosineSimilarity(HashMap<String,Double> firstFeatures, HashMap<String,Double> secondFeatures)
firstFeatures
- The feature vector of the first cluster
secondFeatures
- The feature vector of the second cluster
public static Double calculateNorm(HashMap<String,Double> feature)
feature
- The feature vector of one cluster
public static HashMap<String,Double> findFreqs(String text)
text
- The text whose term frequencies we want to find.
public Set<Entity> getEntitiesWithName(String label)
label
- The value of the name attribute of the entities that we want to return.
public void findRelevanceScores(ArrayList<Category> categories) throws IOException
categories
- The list of supported categories.
IOException
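The three static helpers documented above (findFreqs, calculateNorm, calculateCosineSimilarity) all operate on sparse term vectors stored as HashMap<String,Double>. The sketch below shows one way such helpers could fit together. It is an assumption-laden illustration, not the XLink code: the class CosineSketch and its method names are hypothetical, tokens are split on non-word characters and lower-cased, frequencies are raw counts, the norm is Euclidean, and a zero vector yields a similarity of 0. None of these choices is confirmed by this page.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in, not the XLink implementation. */
final class CosineSketch {

    /** Raw term counts of a text (assumed tokenisation: lower-case, split on non-word characters). */
    static HashMap<String, Double> termFrequencies(String text) {
        HashMap<String, Double> freqs = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1.0, Double::sum);
            }
        }
        return freqs;
    }

    /** Euclidean (L2) norm of a sparse vector. */
    static double norm(Map<String, Double> vector) {
        double sumOfSquares = 0.0;
        for (double value : vector.values()) {
            sumOfSquares += value * value;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Dot product over shared keys divided by the product of the norms; 0 if either vector is zero. */
    static double cosine(Map<String, Double> first, Map<String, Double> second) {
        double dot = 0.0;
        for (Map.Entry<String, Double> entry : first.entrySet()) {
            Double other = second.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other;
            }
        }
        double denominator = norm(first) * norm(second);
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }

    public static void main(String[] args) {
        Map<String, Double> a = termFrequencies("Athens is the capital of Greece");
        Map<String, Double> b = termFrequencies("The capital of Greece");
        System.out.println(cosine(a, b));
    }
}
```

Only keys present in both maps contribute to the dot product, so the cost of a comparison is proportional to the size of the first map.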
public void disambiguate(ArrayList<Category> categories) throws IOException
categories
- The list of supported categories
IOException
public void rankEntities(double f, double x)
f
- The percentage of the document we want to check.
x
- The weight given to the first term of the sum; 1-x is the weight given to the second term.
public void rankURIs() throws IOException
IOException
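The ranking formula given for rankEntities(double f, double x) can be worked through with a short sketch. The class RankSketch and its methods are hypothetical and are not part of XLink. The sketch assumes that f and x are fractions in [0, 1] (the documentation writes them as percentages without stating the scale), that entity positions are character offsets, and that a term with a zero denominator contributes 0.

```java
/** Hypothetical stand-in, not the XLink implementation. */
final class RankSketch {

    /** Counts the offsets that fall inside the first f fraction of a document of the given length. */
    static int countInPrefix(int[] offsets, int docLength, double f) {
        double cutoff = docLength * f;
        int count = 0;
        for (int offset : offsets) {
            if (offset < cutoff) {
                count++;
            }
        }
        return count;
    }

    /**
     * rank = (count / totalCount) * x + (countInPrefix / totalInPrefix) * (1 - x)
     *
     * count and totalCount are appearances over the whole text; countInPrefix and
     * totalInPrefix are appearances within the first f fraction of the text.
     */
    static double rank(int count, int totalCount, int countInPrefix, int totalInPrefix, double x) {
        if (x < 0.0 || x > 1.0) {
            throw new IllegalArgumentException("x is assumed to be a fraction in [0, 1]");
        }
        double overallShare = totalCount == 0 ? 0.0 : (double) count / totalCount;
        double earlyShare = totalInPrefix == 0 ? 0.0 : (double) countInPrefix / totalInPrefix;
        return overallShare * x + earlyShare * (1.0 - x);
    }

    public static void main(String[] args) {
        // An entity with 3 of 10 appearances overall and 2 of 4 appearances in the prefix.
        System.out.println(rank(3, 10, 2, 4, 0.5));
    }
}
```

With x = 0.5 the two shares are weighted equally; moving x toward 0 favours entities that appear early in the document.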
public void findTopEntities()
public static double computeStoilosSimilarity(String st1, String st2)
st1
- The 1st string.
st2
- The 2nd string.
Copyright © 2014. All rights reserved.