public class Spot extends Object implements Serializable
Modifier and Type | Class and Description |
---|---|
static class |
Spot.Parser
A record parser for tsv encoded spots.
|
Modifier and Type | Field and Description |
---|---|
protected static int |
collectionSize |
protected List<Entity> |
entities |
protected int |
freq |
protected double |
idf |
protected int |
link |
protected double |
linkProbability |
protected String |
mention |
Constructor and Description |
---|
Spot(String mention)
Builds a spot from a textual mention
|
Spot(String spot,
List<Entity> entities,
int link,
int freq)
Creates the spot and associates it with the entities that the mention can
refer to, the number of times the mention occurs as anchor text in Wikipedia,
and the number of articles that contain the mention as anchor or simple text
(i.e. the document frequency of the spot).
|
Modifier and Type | Method and Description |
---|---|
Spot |
clone()
Returns a copy of this object
|
boolean |
equals(Object obj) |
static Spot |
fromByteArray(String text,
byte[] data)
Decodes a Spot from a byte representation, if the given text matches the
spot text encoded in the byte array.
|
static Spot |
fromTsvLine(String text)
Decodes a tab separated representation of a spot.
|
static int |
getCollectionSize()
Get the number of entities in the collection
|
List<Entity> |
getEntities()
Return the list of entities with the current mention
|
double |
getEntityCommonness(Entity e)
Computes the entity commonness for a given entity and this spot, i.e.,
P(e|s), the probability for an entity to be associated with
this spot. |
int |
getFrequency()
Returns the document frequency of this spot, i.e., how many Wikipedia
articles contain the spot as simple text or anchor text.
|
double |
getIdf()
Returns the inverse document frequency of this mention.
|
int |
getLink()
Return how many times the mention occurs in the collection as a link to
an entity
|
double |
getLinkProbability()
Return the probability for the current mention to be a link
|
String |
getMention()
Returns the text of this mention.
|
int |
hashCode() |
static void |
setCollectionSize(int collectionSize)
Set the number of entities in the collection
|
void |
setEntities(List<Entity> entities)
Set the list of entities with the current mention
|
void |
setFrequency(int freq)
Set how many times the mention occurs in the collection.
|
void |
setIdf(double idf)
Set the inverse document frequency of the mention in the collection.
|
void |
setLink(int link)
Set how many times the mention occurs in the collection as a link to an
entity
|
void |
setLinkProbability(double probability)
Set the probability for this mention to be a link.
|
void |
setMention(String mention)
Set the text of the mention
|
byte[] |
toByteArray()
Encodes this spot in an array of bytes. The encoding consists of:
1 byte, containing the length of the mention (it is assumed that the
length n of the mention is less than 256);
n bytes, containing the mention encoded in ASCII;
4 bytes, containing the frequency of the mention as a link;
4 bytes, containing the document frequency of the mention;
2 x 4 x m bytes, where m is the number
of entities associated with the mention, containing for each entity its
unique id and its frequency (number of anchors with this text that link
to the entity). |
String |
toString() |
String |
toTsv()
Returns a tab separated version of the spot in a string.
|
protected void |
updateIdf()
Update the IDF using the freq and collectionSize fields
|
protected void |
updateLinkProbability()
Update the link probability using the link and freq fields
|
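The commonness described for getEntityCommonness(Entity e) is a simple ratio: the number of anchors with this spot text that link to the entity, divided by the number of times the spot occurs as an anchor. The following is a minimal standalone sketch of that ratio, not the class's actual code. The class name CommonnessSketch, the zero-count guard, and the sample counts are illustrative assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CommonnessSketch {

    /**
     * P(e|s): anchors with this spot text that link to the entity,
     * divided by all occurrences of the spot as an anchor.
     */
    static double commonness(int entityLinkCount, int spotLinkCount) {
        if (spotLinkCount <= 0) {
            // Assumption: a spot never seen as an anchor gets commonness 0.
            return 0.0;
        }
        return (double) entityLinkCount / spotLinkCount;
    }

    public static void main(String[] args) {
        // Hypothetical data: entity id -> anchors with the spot text linking to it.
        Map<Integer, Integer> entityCounts = new LinkedHashMap<>();
        entityCounts.put(101, 60);
        entityCounts.put(102, 30);
        entityCounts.put(103, 10);
        int link = 100; // times the spot occurs as anchor text

        for (Map.Entry<Integer, Integer> e : entityCounts.entrySet()) {
            System.out.printf("entity %d: P(e|s) = %.2f%n",
                    e.getKey(), commonness(e.getValue(), link));
        }
    }
}
```

When the per-entity counts cover every anchor of the spot, the commonness values over all candidate entities sum to 1.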
protected String mention
protected int freq
protected int link
protected double linkProbability
protected double idf
protected static int collectionSize
public Spot(String mention)
mention
- The string representing the textual mention
public Spot(String spot, List<Entity> entities, int link, int freq)
spot
- The string representing the spot;
entities
- The list of entities that can be referred to by the spot;
link
- How many times the spot occurs in Wikipedia as anchor text;
freq
- How many articles contain the text mention as anchor or
simple text (i.e. the document frequency of the spot).
public int getFrequency()
public void setFrequency(int freq)
freq
- How many times the mention occurs in the collection
public double getIdf()
public void setIdf(double idf)
idf
- The inverse document frequency to set for this mention
public String getMention()
public int getLink()
public void setLink(int link)
link
- How many times the mention occurs in the collection as a link to an
entity
public List<Entity> getEntities()
public void setEntities(List<Entity> entities)
entities
- The list of entities with the current mention
public double getLinkProbability()
public void setLinkProbability(double probability)
probability
- The probability of this mention to be a link
public void setMention(String mention)
mention
- The text of the mention
protected void updateLinkProbability()
protected void updateIdf()
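The two update methods derive linkProbability from the link and freq fields, and idf from the freq and collectionSize fields. This page does not give the formulas, so the sketch below assumes the conventional definitions: link probability as link / freq, and IDF as the natural logarithm of collectionSize / freq. The class name SpotStatsSketch, the clamping to 1.0, and the zero-count guards are illustrative assumptions, not the library's documented behaviour.

```java
final class SpotStatsSketch {

    /** Assumed definition: fraction of documents where the mention is a link. */
    static double linkProbability(int link, int freq) {
        if (freq <= 0) {
            return 0.0; // assumption: no occurrences means probability 0
        }
        // Assumption: clamp to 1.0 in case the two counts were gathered separately.
        return Math.min(1.0, (double) link / freq);
    }

    /** Assumed definition: idf = ln(collectionSize / freq). */
    static double idf(int collectionSize, int freq) {
        if (collectionSize <= 0 || freq <= 0) {
            return 0.0; // assumption: undefined IDF is reported as 0
        }
        return Math.log((double) collectionSize / freq);
    }

    public static void main(String[] args) {
        // Hypothetical counts for a single mention.
        int link = 50;
        int freq = 200;
        int collectionSize = 1000;
        System.out.println("linkProbability = " + linkProbability(link, freq));
        System.out.println("idf = " + idf(collectionSize, freq));
    }
}
```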
public static int getCollectionSize()
public static void setCollectionSize(int collectionSize)
collectionSize
- The number of entities in the collection
public double getEntityCommonness(Entity e)
Computes the entity commonness for a given entity and this spot, i.e.,
P(e|s), the probability for an entity to be associated with
this spot. It is computed by dividing the number of times that the spot
links to the entity by the frequency of the spot as an anchor.
e
- The entity for which to compute the commonness
public byte[] toByteArray()
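Encodes this spot in an array of bytes. The encoding consists of:
1 byte, containing the length of the mention (it is assumed that the
length n of the mention is less than 256);
n bytes, containing the mention encoded in ASCII;
4 bytes, containing the frequency of the mention as a link;
4 bytes, containing the document frequency of the mention;
2 x 4 x m bytes, where m is the number
of entities associated with the mention, containing for each entity its
unique id and its frequency (number of anchors with this text that link
to the entity).
public static Spot fromByteArray(String text, byte[] data)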
text
- The spot text to decode
data
- The binary representation of the spot
Returns the decoded spot if text matches the text
of the spot encoded in data, otherwise null
See also: toByteArray()
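The byte layout described for toByteArray(), and the text check performed by fromByteArray(String, byte[]), can be sketched with java.nio.ByteBuffer. This is a standalone illustration under stated assumptions, not the class's own implementation: the class SpotCodecSketch and its methods are hypothetical, entities are represented as plain {id, frequency} int pairs rather than Entity objects, and big-endian byte order (the ByteBuffer default) is assumed because this page does not specify one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SpotCodecSketch {

    /** Encodes: 1 length byte, n ASCII bytes, link, freq, then (id, freq) per entity. */
    static byte[] encode(String mention, int link, int freq, int[][] entities) {
        byte[] text = mention.getBytes(StandardCharsets.US_ASCII);
        if (text.length > 255) {
            throw new IllegalArgumentException("mention longer than 255 bytes");
        }
        ByteBuffer buf = ByteBuffer.allocate(1 + text.length + 4 + 4 + 2 * 4 * entities.length);
        buf.put((byte) text.length); // 1 byte: mention length n
        buf.put(text);               // n bytes: mention in ASCII
        buf.putInt(link);            // 4 bytes: frequency of the mention as a link
        buf.putInt(freq);            // 4 bytes: document frequency of the mention
        for (int[] e : entities) {
            buf.putInt(e[0]);        // 4 bytes: entity id
            buf.putInt(e[1]);        // 4 bytes: anchors with this text linking to the entity
        }
        return buf.array();
    }

    /** Returns {link, freq, entityCount}, or null if text differs from the encoded mention. */
    static int[] decodeCounts(String text, byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int n = buf.get() & 0xFF;    // read the length byte as unsigned
        byte[] stored = new byte[n];
        buf.get(stored);
        if (!new String(stored, StandardCharsets.US_ASCII).equals(text)) {
            return null;             // mirrors the "otherwise null" contract
        }
        int link = buf.getInt();
        int freq = buf.getInt();
        int entityCount = buf.remaining() / 8; // each entity takes 2 x 4 bytes
        return new int[] {link, freq, entityCount};
    }

    public static void main(String[] args) {
        // Hypothetical spot with two candidate entities.
        byte[] data = encode("jaguar", 100, 250, new int[][] {{101, 60}, {102, 40}});
        System.out.println("encoded length = " + data.length);
        System.out.println("matching text decodes: " + (decodeCounts("jaguar", data) != null));
        System.out.println("other text decodes: " + (decodeCounts("puma", data) != null));
    }
}
```

Masking the length byte with 0xFF matters because Java bytes are signed: a mention of 128 to 255 characters would otherwise be read back as a negative length.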
public String toTsv()
public static Spot fromTsvLine(String text)
text
- The tab separated representation of the spot to decode
See also: toTsv()
Copyright © 2013. All rights reserved.