The Gist

  • What is Google Bard? Google’s Bard is based on a scaled-down version of the Language Model for Dialogue Applications (LaMDA), which was trained on a dataset called Infiniset consisting of public dialog data and web text, including C4-based data, English language Wikipedia, code documents, web documents, non-English web documents, and dialogs data from public forums.
  • ChatGPT’s got nothing on us. Bard’s underlying technology is expected to be at least as good as ChatGPT and will be able to index the newest data, interpret it, and surface it in a more interesting way.
  • All about quality content. AI-driven search engines will offer opportunities, such as improved search results and customer engagement. However, businesses must adapt their SEO strategies to optimize content for AI and focus on high-quality content that provides value to users.

On Feb. 6, Google and Alphabet CEO Sundar Pichai announced in a blog post that Google would soon be incorporating its own ChatGPT-like generative AI into its Google search engine. Google Bard, as it’s called, is based on Google’s Language Model for Dialogue Applications (LaMDA), which was pre-trained on Infiniset, a dataset drawn from internet content that contains 1.56 trillion words of “public dialog data and web text.”

Google’s announcement of Bard was soon followed by Microsoft’s announcement of its own AI-driven update to its Bing search engine, which is said to be more advanced than the recently viral ChatGPT application that OpenAI created.

Although there has been much enthusiasm about these AI-driven search engines by both the public and business sectors, there has also been some trepidation concerning how they will affect search engine optimization (SEO), search advertising, marketing and even website page views. This article will discuss those concerns and the opportunities this new technology will open up, as well as what we know about Google Bard so far.

What Was Bard Trained On?

According to Sundar Pichai’s blog post announcing Google Bard, the generative AI application is based on a scaled-down version of the Language Model for Dialogue Applications (LaMDA). Although the 2022 LaMDA research paper lists the percentages of different types of data that were used to train LaMDA, it is still quite vague. LaMDA was trained on a dataset called Infiniset, which is a blend of content that was selected to enhance the model’s ability to participate in dialogues. The Infiniset dataset includes the following:

  • 12.5% C4-based (Colossal Clean Crawled Corpus) data
  • 12.5% English language Wikipedia
  • 12.5% code documents from programming Q&A websites, tutorials and others
  • 6.25% English web documents
  • 6.25% Non-English web documents
  • 50% dialogs data from public forums
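
As a rough check on the mix listed above, the shares can be tallied in a few lines of Python. This is only a sketch: it assumes each percentage is a share of the 1.56 trillion-word total cited earlier, which the article does not confirm, and the names `INFINISET_MIX`, `TOTAL_WORDS` and `estimated_words` are illustrative rather than taken from the LaMDA paper.

```python
# Infiniset composition as listed above (percent of the training mix).
INFINISET_MIX = {
    "C4-based data": 12.5,
    "English Wikipedia": 12.5,
    "Code documents": 12.5,
    "English web documents": 6.25,
    "Non-English web documents": 6.25,
    "Dialogs from public forums": 50.0,
}

# Total pre-training words reported for LaMDA.
TOTAL_WORDS = 1.56e12


def estimated_words(mix, total_words):
    """Return an estimated word count per component.

    Assumption: each percentage is a share of the total word count.
    """
    return {name: total_words * pct / 100 for name, pct in mix.items()}


if __name__ == "__main__":
    # The listed shares should account for the whole dataset.
    assert sum(INFINISET_MIX.values()) == 100.0
    for name, words in estimated_words(INFINISET_MIX, TOTAL_WORDS).items():
        print(f"{name}: ~{words / 1e9:.1f} billion words")
```

Under that assumption, the forum dialog share alone would come to roughly 780 billion words, half of the total.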

The C4 dataset is based on Common Crawl data, which is an open-source dataset that comes from a variety of websites that have been scraped. Common Crawl is a registered nonprofit organization that crawls the internet regularly to create free datasets that are available for training AI applications. Although many websites are scraped to create the dataset, the top 25 websites that are included in C4 are:

  • patents.google.com
  • en.wikipedia.org
  • en.m.wikipedia.org
  • www.nytimes.com
  • www.latimes.com
  • www.theguardian.com
  • journals.plos.org
  • www.forbes.com
  • www.huffpost.com
  • patents.com
  • www.scribd.com
  • www.washingtonpost.com
  • www.fool.com
  • ipfs.io
  • www.frontiersin.org
  • www.businessinsider.com
  • www.chicagotribune.com
  • www.booking.com
  • www.theatlantic.com
  • link.springer.com
  • www.aljazeera.com
  • www.kickstarter.com
  • caselaw.findlaw.com
  • www.ncbi.nlm.nih.gov
  • www.npr.org
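
The article does not say how this ranking was produced. One plausible way to build a list like it is to tally the host name of every document URL in a crawl, sketched below. The function name `top_domains` and the sample URLs are hypothetical, and a ranking weighted by the amount of text per site rather than by document count could come out differently.

```python
from collections import Counter
from urllib.parse import urlparse


def top_domains(urls, n=25):
    """Return the n most common host names among the given URLs."""
    hosts = Counter(urlparse(url).netloc.lower() for url in urls)
    return hosts.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample URLs for illustration; not taken from C4.
    sample = [
        "https://patents.google.com/patent/EXAMPLE1",
        "https://patents.google.com/patent/EXAMPLE2",
        "https://en.wikipedia.org/wiki/Example",
        "https://www.nytimes.com/example-article",
    ]
    for host, count in top_domains(sample, n=3):
        print(host, count)
```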

Half of the data that was used to train LaMDA comes from “dialogs data from public forums,” and although there are no specific sites mentioned in the paper, they are likely to include large community discussion sites such as Reddit and Quora, among others.

Raymond Velez, global chief technology officer at digital consultancy Publicis Sapient, told CMSWire that with Bard, Google is not only responding to OpenAI’s products but also making the very strong case that much of ChatGPT’s technical capability, the “T” in ChatGPT (GPT stands for generative pre-trained transformer), is based on the transformers that Google’s AI scientists pioneered. “The expectation is Bard’s underlying technology will be at least as good as ChatGPT,” said Velez. “Another Google advantage is that the data set it has is not limited to data from 2021 so if it can combine being able to index the newest data and then interpret and surface it so that it will be more interesting.”

How Will Bard Display Search Results?

Although details about how Bard will display search results are extremely limited, there are some details one can glean from the Google Blog announcement. Aside from the dataset that Bard was trained on, Bard will be able to use current information from the internet to generate responses to search queries. The following interaction was included in the blog:

James Webb Telescope

Inaccurate details in the Bard response shown above were quickly pointed out by New Scientist, which reported that Grant Tremblay of the Harvard–Smithsonian Center for Astrophysics had tweeted that the third statement in the example was untrue. After the mistake was publicized, Alphabet’s stock dropped 9% in one day. More important to marketers and advertisers is how Bard will affect search results.

The blog stated that users will soon see “AI-powered features in Search that distill complex information and multiple perspectives into easy-to-digest formats, so you can quickly understand the big picture and learn more from the web.”

Although Google has not thus far discussed the specifics of the way that search engine results will be displayed, an image from the Google blog showed what the Bard results may look like on a mobile device:

Mobile Device

This suggests that the search results will first show a summary generated by Bard in a chat format, followed by the traditional search results. It is unclear whether the summary will cite the original sources it draws on. Microsoft’s recent announcement that its Bing search engine will soon be powered by AI did include details about how results will be displayed, stating that original sources will be cited and linked in the results. Microsoft also provided examples of search engine results, with the AI chat component’s summary displayed to the right of search results, complete with links to sources:

Links To Sources

What Are the Implications for Website Traffic?

Because little is known about the way Google search engine results will be impacted by Bard, discussion of the implications for website traffic is speculative. Although the example that Google CEO Sundar Pichai provided in the blog does not show links to the original sources, Microsoft has indicated that it will cite original sources in its AI-generated summary text, so it is likely that Google will do so as well.

A larger question is whether the inclusion of AI-generated results to search engine queries is designed around the idea of a new paradigm for search in which users are able to obtain complete answers to their questions without a need to visit another website. This isn’t completely new for Google, as to a certain degree, it has been displaying answers to specific questions for some time now via its Featured Snippets.

Google Featured Snippets

It is unknown whether Google might display an AI chat window to the right of search results so that users can refine the results and continue the discussion with Bard, as Microsoft has done with its AI-driven Bing. The impact on website traffic will depend on whether or not Google will display links to original sources of summarized material. Additionally, if Google users can find a complete answer to their queried question, they may not be as likely to need additional information, which will limit the traffic to the original source websites.

Bob Rogers, data scientist and CEO at Oii.ai, a data science company specializing in supply chain modeling, told CMSWire that AI-driven search engines can potentially disrupt how websites are ranked. “SEO and paid ads get top billing in Google search results right now, even if they are irrelevant to what you’re actually looking for,” said Rogers. “In an AI-assisted search world, the AI chatbot will return a synthesized, easy-to-consume result with callouts to individual web pages.” Rogers explained that the ranking that determines which sources get called out, and how they are displayed by the chatbot, will replace current search ranking models and could completely disrupt the current SEO status quo.

Many content creators and businesses may resort to restructuring much of their content to provide answers to specific questions, similar to current SEO practices that are used to get listed in Google’s featured snippets:

More Google Featured Snippets

Rogers believes that this new ranking system might pose a partial solution to ethical concerns that AI chatbots draw heavily on sources that they don’t credit. “In the case of assimilating web page results and calling out sources to generate paid search revenue, the attribution to source material may be dramatically increased because advertised content sources only pay if they end up with a measurable outcome (click, conversion, etc.).”

Bard May Not Be Ready for Prime Time

While Google Bard and associated AI technologies may have broad ramifications for marketing, advertising, SEO and the way the public uses search, there are many who believe that Bard is not yet ready for prime time, including Alphabet’s own Chair John Hennessy. In an interview with CNBC, Hennessy admitted that in his opinion, the technology is not yet product ready, stating that “Google was hesitant to productize this because it didn’t think it was really ready for a product yet.”

Hennessy went on to say that part of Google’s hesitance to announce Bard was because it was still generating inaccurate information in response to queries. “You don’t want to put a system out that either says wrong things or sometimes says toxic things,” Hennessy told CNBC in the interview.

Google employees were also quick to denounce the rushed Bard announcement and have been posting dismissive memes on Google’s internal forum Memegen about how the Bard launch was handled, pointing fingers at CEO Sundar Pichai for what many see as a preemptive derailment of the product.

The criticism from employees is similar to the public reception of the announcement, which came a day before Microsoft was to announce its own generative AI chat addition to its Bing search engine. Google attempted to steal the spotlight from Microsoft, but aside from the factual blunder, the blog post that announced Bard contained limited information about the functionality and abilities of the product and mainly fueled speculation about how it will be integrated into the Google search engine.

Final Thoughts on Google Bard

Generative AI will undoubtedly impact the public perception and use of search, and the implications for publishers, marketers and advertisers are broad. Google’s Bard, when it is ready, will also change SEO practices more than previous search algorithm updates, forcing content creators to focus on creating informative content that answers more complex or detailed questions. Bard also raises questions concerning attribution, and the impact it will have on website traffic, because if it can effectively summarize a complete answer to a query, users will have no need to click on to the website(s) that the information was derived from.


