The Normalized Google Distance (NGD) is a semantic similarity measure derived from the number of hits returned by the Google search engine for a given set of keywords. Keywords with the same or similar meanings in a natural language sense tend to be 'close' in units of Normalized Google Distance, while words with dissimilar meanings tend to be farther apart.

Specifically, the Normalized Google Distance (NGD) between two search terms x and y is

$$\operatorname{NGD}(x,y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x,y)}{\log N - \min\{\log f(x), \log f(y)\}}$$

where N is the total number of web pages searched by Google multiplied by the average number of singleton search terms occurring on pages; f(x) and f(y) are the number of hits for search terms x and y, respectively; and f(x, y) is the number of web pages on which both x and y occur.

If NGD(x, y) = 0, then x and y are viewed as alike as possible, but if NGD(x, y) ≥ 1, then x and y are very different. If the two search terms x and y never occur together on the same web page, but do occur separately, the NGD between them is infinite. If both terms always occur together, their NGD is zero.

Example: On 9 April 2013, googling for 'Shakespeare' gave 130,000,000 hits; googling for 'Macbeth' gave 26,000,000 hits; and googling for 'Shakespeare Macbeth' gave 20,800,000 hits. The number of pages indexed by Google was estimated by the number of hits for the search term 'the', which was 25,270,000,000.
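The definition above can be sketched in Python using the hit counts quoted in this example. This is a minimal illustration, not a reference implementation: the function name `ngd`, its parameter names, and the choice of the natural logarithm are assumptions made here (the ratio does not depend on the logarithm base), and the figure of 1,000 search terms per page is the rough estimate this example adopts for N.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance from raw hit counts (hypothetical helper).

    fx, fy -- hits for each term alone
    fxy    -- hits for pages containing both terms
    n      -- normalizing constant N (pages indexed times terms per page)
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur at least once")
    if fxy <= 0:
        # Terms that occur separately but never together are infinitely far apart.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hit counts from the 9 April 2013 example.
hits_the = 25_270_000_000      # hits for 'the', used as the page-count estimate
terms_per_page = 1_000         # assumed average number of search terms per page
n = hits_the * terms_per_page

distance = ngd(130_000_000, 26_000_000, 20_800_000, n)
print(f"{distance:.2f}")  # prints 0.13
```

Passing N in explicitly keeps the estimate of the index size separate from the distance computation, so the same function could be reused with counts from another corpus.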
Assuming there are about 1,000 search terms on the average page, this gives N = 25,270,000,000,000. Substituting these counts into the definition gives NGD(Shakespeare, Macbeth) ≈ 0.13. Hence 'Shakespeare' and 'Macbeth' are very much alike according to the relative semantics supplied by Google.

The Normalized Google Distance is derived from the earlier Normalized Compression Distance (NCD). Namely, objects can be given literally, like the literal four-letter genome of a mouse, or the literal text of Macbeth by Shakespeare. The similarity of these objects is given by the NCD. For simplicity we take it that all meaning of the object is represented by the literal object itself. Objects can also be given by name, like 'the four-letter genome of a mouse' or 'the text of Macbeth by Shakespeare.' There are also objects that cannot be given literally, but only by name, and that acquire their meaning from their contexts in the background common knowledge of humankind, like 'home' or 'red.' The similarity between names for objects is given by the NGD.

The probabilities of Google search terms, conceived as the page counts returned by Google divided by the number of pages indexed by Google (multiplied by the average number of search terms in those pages), approximate the actual relative frequencies of those search terms as used in society. Based on this premise, the relations represented by the Normalized Google Distance approximately capture the assumed true semantic relations governing the search terms. The NGD uses the World Wide Web as its corpus and Google as its search engine. Other text corpora, such as Wikipedia, the King James version of the Bible, or the Oxford English Dictionary, can be used together with appropriate search engines.

The following properties are proved in: