Scaling up search engine audits (2024)

research-article

Open Access

  • Authors:
  • Roberto Ulloa, Computational Social Science, GESIS – Leibniz-Institute for the Social Sciences, Germany (https://orcid.org/0000-0002-9870-5505)
  • Mykola Makhortykh, Institute of Communication and Media Studies, University of Bern, Switzerland
  • Aleksandra Urman, Social Computing Group, University of Zurich, Switzerland

Journal of Information Science, Volume 50, Issue 2, Apr 2024, pp 404–419, https://doi.org/10.1177/01655515221093029

Published: 16 April 2024



Abstract

Algorithm audits have increased in recent years due to a growing need to independently assess the performance of automatically curated services that process, filter and rank the large and dynamic amount of information available on the Internet. Among several methodologies to perform such audits, virtual agents stand out because they offer the ability to perform systematic experiments, simulating human behaviour without the associated costs of recruiting participants. Motivated by the importance of research transparency and replicability of results, this article focuses on the challenges of such an approach. It provides methodological details, recommendations, lessons learned and limitations based on our experience of setting up experiments for eight search engines (including main, news, image and video sections) with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections, with diverse experimental designs, and point to different changes and strategies that improve the quality of the method. We conclude that virtual agents are a promising avenue for monitoring the performance of algorithms across long periods of time, and we hope that this article can serve as a basis for further research in this area.



    Published in

      Journal of Information Science, Volume 50, Issue 2, Apr 2024, 264 pages

      ISSN: 0165-5515

      © The Author(s) 2022

      This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

      Publisher

      Sage Publications, Inc., United States

          Author Tags

          • Algorithm auditing
          • data collection
          • search engine audits
          • user modelling

