In SHARES, a link is an instance of lexical repetition between a pair of sentences, one from text A and one from text B. In the example below there are 3 links between the two sentences, on the words Indonesia, economic and growth. Stopwords are excluded from linking. If the stemmed corpus is selected, links are also formed between words with the same base (e.g. economic would link with economy).

An individual token in a sentence is only allowed to link once with another sentence. In the example below, if the word economic had appeared twice in sentence B, the single economic in sentence A would link only with the first occurrence of economic in sentence B. However, if economic had appeared twice in each sentence, then 2 links would be formed on that word.
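The one-link-per-token rule can be sketched as a multiset intersection: a word that occurs m times in one sentence and n times in the other yields min(m, n) links. The following is a minimal illustration, not the SHARES implementation. The stopword list, the tokenizer and the function names are assumptions made for the example, and stemming and phrase handling are left out.

```python
from collections import Counter

# Assumption: a tiny illustrative stopword list; the list SHARES uses is not given here.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "and", "to"}

def tokenize(sentence):
    """Lowercase each word, strip surrounding punctuation, and drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in sentence.split())
    return [w for w in words if w and w not in STOPWORDS]

def count_links(sentence_a, sentence_b):
    """Count links: each token links at most once, so a shared word
    contributes min(occurrences in A, occurrences in B) links."""
    a = Counter(tokenize(sentence_a))
    b = Counter(tokenize(sentence_b))
    return sum((a & b).values())  # Counter & Counter keeps the minimum counts

sentence_a = "Indonesia expects economic recovery and steady growth."
sentence_b = "Growth in Indonesia depends on economic reform."
print(count_links(sentence_a, sentence_b))
```

With these two hypothetical sentences the shared words are Indonesia, economic and growth. A second occurrence of economic in only one of the sentences would not add a link, whereas a second occurrence in both would.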

Consecutive links between two sentences are treated as a phrase and, thus, a single link. For example, if the phrase economic retrenchment had appeared in both sentences then it would count as only one link.
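One possible reading of the phrase rule is sketched below: links are first formed token by token, and a link that directly continues the previous one in both sentences is merged into it. This is an assumption about how "consecutive" is applied, not a description of the SHARES code. The function names are hypothetical, and the token lists are assumed to be lowercased with stopwords already removed.

```python
def link_pairs(tokens_a, tokens_b):
    """Pair each token in A with the first unused identical token in B.
    Returns (index_in_a, index_in_b) pairs; each token links at most once."""
    used_b = set()
    pairs = []
    for i, token in enumerate(tokens_a):
        for j, other in enumerate(tokens_b):
            if j not in used_b and token == other:
                used_b.add(j)
                pairs.append((i, j))
                break
    return pairs

def count_links_with_phrases(tokens_a, tokens_b):
    """Count links, treating a run of links that is consecutive in both
    sentences (same order, adjacent positions) as a single link."""
    pairs = link_pairs(tokens_a, tokens_b)
    pair_set = set(pairs)
    # A pair opens a new link unless it continues a run from (i - 1, j - 1).
    return sum(1 for (i, j) in pairs if (i - 1, j - 1) not in pair_set)

a = ["government", "announced", "economic", "retrenchment"]
b = ["economic", "retrenchment", "hit", "exports"]
print(count_links_with_phrases(a, b))
```

Here economic and retrenchment are adjacent in both token lists, so the two word-level links collapse into one phrase link.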

The links between each pair of sentences, one from each text, are recorded in a connectivity matrix. If the number of links between a pair of sentences (from different articles) is higher than the selected link threshold, the sentences are said to have bonded. Sentences over a given bond threshold (i.e. bonded with more than a specified number of other sentences) are considered to be core bearers of information. The number of bonded sentences between a pair of texts is aggregated, and this count is higher for texts on similar topics.
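The matrix, the two thresholds and the aggregate count can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, sentences are given as pre-tokenized lists with stopwords removed, the link count ignores phrase merging, and "higher than" and "more than" are both read as strictly greater.

```python
from collections import Counter

def count_links(tokens_a, tokens_b):
    """Simplified link count: each token links at most once (no phrase merging)."""
    return sum((Counter(tokens_a) & Counter(tokens_b)).values())

def connectivity_matrix(text_a, text_b):
    """matrix[i][j] holds the links between sentence i of text A and sentence j of text B."""
    return [[count_links(sa, sb) for sb in text_b] for sa in text_a]

def bonded_pairs(matrix, link_threshold):
    """Sentence pairs whose link count is strictly higher than the link threshold."""
    return [(i, j)
            for i, row in enumerate(matrix)
            for j, links in enumerate(row)
            if links > link_threshold]

def core_sentences(pairs, bond_threshold):
    """Indices of text A sentences bonded with more than bond_threshold sentences of text B."""
    bonds = Counter(i for i, _ in pairs)
    return sorted(i for i, n in bonds.items() if n > bond_threshold)

text_a = [["indonesia", "economic", "growth", "slowed"],
          ["rupiah", "fell", "dollar"]]
text_b = [["economic", "growth", "indonesia", "forecast"],
          ["indonesia", "growth", "economic", "reform"],
          ["dollar", "rose"]]

matrix = connectivity_matrix(text_a, text_b)
pairs = bonded_pairs(matrix, link_threshold=2)
print(matrix)                                   # link counts per sentence pair
print(len(pairs))                               # aggregate bond count for the two texts
print(core_sentences(pairs, bond_threshold=1))  # core sentences of text A
```

In this hypothetical data the first sentence of text A bonds with two sentences of text B, so it passes a bond threshold of 1, while the second sentence shares only one word with any sentence of text B and forms no bond.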

This method of calculating aboutness is particularly relevant when comparing newspaper articles, which, as relatively short texts, typically express just one main idea or proposition and develop it sentence by sentence with little redundancy.

