Women’s History Month: Karen Sparck Jones, Originator of the Search Engine

Karen Sparck Jones was a pioneering British computer scientist. She was a self-taught computer programmer and is best known for the concept of inverse document frequency, the technique that helped to establish the basis of search engines such as Google.

How It All Started

Sparck Jones was born in Huddersfield, England, on the 26th of August, 1935. She studied History and Philosophy at Girton College in Cambridge, UK. Whilst studying at Cambridge she met Margaret Masterman, the head of the Cambridge Language Research Unit, who inspired her to enter this line of work. Sparck Jones described Masterman as “a very strange and interesting woman.” Masterman had kept her maiden name professionally after getting married, which, at that time, was rather rare. Sparck Jones did the same after marrying fellow computer scientist Roger Needham in 1958. Regarding this, she said that “it maintains a permanent existence of your own.”

After graduating in 1956, she briefly became a school teacher. However, she soon went on to work for Masterman at the Cambridge Language Research Unit in the late 1950s. Her goal was to learn how to program a computer to understand the various meanings of a word, essentially programming a huge thesaurus. In 1964, she published ‘Synonymy and Semantic Classification’, a foundational paper in the natural language processing field.

IDF

Throughout her career, Sparck Jones focused her research on natural language processing and information retrieval. In her seminal 1972 paper in the Journal of Documentation, she introduced the concept of inverse document frequency (IDF). IDF is a statistic that measures how specific a word is: a word that appears in only a few documents in a collection carries more weight than one that appears in almost all of them. It is used as a weighting factor in information retrieval and text mining. This came to be her most important work, as search engines today use IDF as part of the term frequency-inverse document frequency (TF-IDF) weighting scheme, which estimates how important a word is to a document and ultimately helps dictate where a document or page should appear in the search results.
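To make the idea concrete, here is a minimal Python sketch of IDF and TF-IDF weighting. It assumes one common textbook form, IDF = log(N / n), where N is the number of documents and n is the number of documents containing the word; this is not necessarily the exact formula from the 1972 paper, and real search engines use more elaborate variants. The function names and the three toy documents are invented for illustration.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: log(N / n_t)} for a list of tokenised documents."""
    n_docs = len(documents)
    # Count how many documents each term appears in (not total occurrences).
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


def tf_idf(document, idf):
    """Weight each term in one document by its raw count times its IDF."""
    counts = Counter(document)
    return {term: count * idf[term] for term, count in counts.items()}


# Toy collection, invented for illustration only.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
idf = inverse_document_frequency(docs)
weights = tf_idf(docs[0], idf)
```

In this toy collection, “the” appears in every document, so its weight is zero even though it occurs twice in the first document, while “mat” appears in only one document and receives the highest weight there.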

Ahead Of Her Time

When Sparck Jones was working, computer scientists were mainly focused on making people use code to talk to computers. Sparck Jones, however, was teaching computers to understand human language. A friend of Sparck Jones, John Tait, said, “a lot of the stuff she was working on until 5 or 10 years ago seemed like mad nonsense, and now we take it for granted.” With the brand name ‘Google’ almost becoming a verb now, this couldn’t be more true.

To this day, researchers still cite her formulas, which shows that she was ahead of her time. She was also forward-thinking about the impact computing could have on society, an issue that is still being tackled today. She said, “there is an interaction between the context and the programming task itself…you don’t need a fundamental philosophical discussion every time you put your finger to keyboard, but as computing is spreading so far into people’s lives, you need to think about these things.”

Advocate For Women In Computing

In 1982, the British government approached Sparck Jones to work on the Alvey Programme, an initiative set up to encourage more computer science research across the country. Sparck Jones was also a huge advocate for getting more women into computing and said, “I think it’s very important to get more women into computing. My slogan is: Computing is too important to be left to men.”

In 1994, she became president of the Association for Computational Linguistics, and by 1999 she had become a full-time professor at Cambridge University. It did bother her, though, that it took so long for her to reach this position. Before this, she was employed by the university on contract, a non-permanent and lower-status form of academic employment. She attributed this delay in her career to the fact that “Cambridge was in many ways not user-friendly, in the sense of women-friendly.”

Later Life

Sparck Jones worked at the Cambridge University Computer Laboratory until she retired in 2002. Unfortunately, her life and work were relatively overlooked for over 20 years. This was apparent when her husband, who passed away in 2003, received an obituary in The Times but she did not. She passed away from cancer on the 4th of April 2007. In 2019, The New York Times included Sparck Jones in the series ‘Overlooked’, which, in their words, is a “series of obituaries about remarkable people whose deaths, beginning in 1851, went unreported in The Times.”

Recognition

Sparck Jones was awarded the honorary degree of Doctor of Science by the Vice Chancellor of City University London, Raoul Franklin. 

To further honour her achievements, an annual award was established in 2008 by the BCS Information Retrieval Specialist Group. The Karen Sparck Jones Award promotes talented researchers who strive to significantly advance the understanding of natural language processing or information retrieval.

In her hometown, the University of Huddersfield renamed one of its buildings ‘the Sparck Jones building’, which is home to the university’s School of Computing and Engineering.

Here are just a few of her other achievements:

1988 – awarded the Gerard Salton Award by the Association for Computing Machinery (ACM)

2004 – awarded the Lifetime Achievement Award by the Association for Computational Linguistics (ACL)

2007 – awarded the BCS Lovelace Medal 

2007 – awarded the ACM Women’s Group (ACM-W) Athena Award

References:

“Overlooked no more: Karen Sparck Jones, who established the basis for search engines”. Bowles, N. January 2019

“Karen Sparck Jones (1935-2007)”. Wilks, Y. June 2007

“Women’s History Month”. Pracy, V. March 2019

“Karen Ida Boalth Sparck Jones 1935-2007”. Pulman, S.G. January 2012

“Computer science, a woman’s work”. IEEE Spectrum. May 2007

  • Written by Mikita Maru, March 30 2022