A rough timeline of AI development

A rough timeline of “milestones” in the appearance of Artificial Intelligence, with emphasis on recent developments.

2024
Apr 24

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Microsoft Research presented VASA-1, an AI framework that – from a single static image and a speech audio clip – can create lifelike talking faces of virtual characters with appealing visual affective skills.

 

 

 

VASA-1 is capable of not only producing lip movements that are exquisitely synchronized with the audio, but also capturing a large spectrum of facial nuances and natural head motions that contribute to the perception of authenticity and liveliness.

This development paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.

 

VASA-1 website

 

Apr 24

Sora – Creating video from text. An AI model that can create realistic and imaginative scenes from text instructions.

Prompt: “A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.”

 

Prompt: “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.”

 

Example videos of Sora’s capabilities

 

Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt.

 

Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.

 

The model has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions. Sora can also create multiple shots within a single generated video that accurately persist characters and visual style.

Apr 20

AI Search Engines and Browser AI Plugins are taking over more and more from traditional Google Search


What is an AI search engine?
An AI search engine is a search platform that uses machine learning models and artificial intelligence to provide results. With the help of other AI-based algorithms and natural language processing, it can understand more complex search queries and provide improved, more relevant, and accurate results to users.

 

Moreover, an AI search engine automatically generates more accurate search results and detailed, unique answers to complex questions. Instead of relying on keywords, these tools examine your queries and check your preferences, past interactions, interests, and behavior to provide a personalized answer that follows your requirements.

 

How does an AI-based search engine work?

AI search engines work similarly to their traditional counterparts but employ additional AI technologies, algorithms, and deep learning to understand queries better and offer improved results.

 

Here’s what a typical AI search engine does:

 

1. An AI search engine begins by indexing content from the web.
2. Once a user types in a question, the tool uses natural language processing to understand the meaning and intent of the query.
3. The engine determines the relevance of its indexed items to the query and assigns a relevancy score to each item.
4. The engine places the most relevant results at the top and orders the rest of the list by relevancy score.

Moreover, machine learning models enable AI search engines to improve based on historical data and offer more relevant content and more accurate results after every interaction.
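The pipeline described above can be sketched as a toy retrieval loop. This is a minimal illustration only: the documents, the whitespace tokenizer, and the TF-IDF scoring below are made-up stand-ins for a real engine's crawler, NLP query understanding, and learned relevance model.

```python
from collections import Counter
import math

# Toy "index": documents the engine has already crawled (illustrative data).
DOCS = {
    "d1": "ai search engines use machine learning to rank results",
    "d2": "traditional search engines match keywords in web pages",
    "d3": "machine learning models improve ranking from user feedback",
}

def tokenize(text):
    # Stand-in for real query understanding: just lowercase and split.
    return text.lower().split()

def score(query, doc_text, n_docs, doc_freq):
    # Simple TF-IDF relevancy score, standing in for a learned relevance model.
    q_terms = tokenize(query)
    doc_terms = Counter(tokenize(doc_text))
    s = 0.0
    for t in q_terms:
        if t in doc_terms:
            idf = math.log(n_docs / (1 + doc_freq[t]))  # rarer terms weigh more
            s += doc_terms[t] * idf
    return s

def search(query):
    # 1) the index is DOCS; 2) "understand" the query (tokenize);
    # 3) score every indexed item; 4) return items sorted by relevancy score.
    doc_freq = Counter(t for text in DOCS.values() for t in set(tokenize(text)))
    scores = {d: score(query, text, len(DOCS), doc_freq) for d, text in DOCS.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(search("machine learning ranking"))  # most relevant document id first
```

Real engines add many layers on top of this (link analysis, personalization, neural rerankers), but the index–understand–score–rank skeleton is the same.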

 

AI-driven search engine vs traditional search engine:

 

AI search engines can understand natural language queries and content better than regular search engines can.
Where regular search engines use only words and phrases to determine their results, AI tools consider the context and actual meaning behind the question.

AI-based search tools can cover advanced features like image search or voice search.
Traditional search engines work from scratch with every query, while their AI counterparts can learn from past interactions and improve over time as a result of machine learning.

AI search engines can handle complex questions and longer queries instead of being limited to simple ones.

Unlike traditional search engines, most AI-based ones offer customized and personalized search experiences.

AI search engines can interact conversationally and handle follow-up questions, whereas conventional search engines can only answer queries.

 

Here’s a list of AI search engines for browsers, along with links to their relevant pages.
The list is current as of Feb. 2024, but new startups and developments appear quickly.

 

Perplexity AI
Perplexity.ai
Perplexity.ai is an advanced answer engine founded by experts connected to DeepMind and OpenAI. It uses state-of-the-art AI models, including OpenAI’s GPT models, to provide direct answers.
However, instead of the traditional list of links, it offers summaries and citations as search results. This means that when users ask questions, the search engine not only provides relevant information but also presents concise summaries and references to relevant sources.

Moreover, Perplexity.ai offers an uninterrupted and straightforward browsing experience for users. It is ad-free and does not require users to sign up or log in. More importantly, this AI search engine does not collect any personal information from its users, ensuring a secure and confidential search experience.

Anthropic’s Claude
–  Anthropic’s Claude
– Description: Claude is a conversational AI assistant developed by Anthropic that can search the internet, answer questions, and assist with various tasks.

 

Brave Search
Brave Search
– Description: Brave Search is a privacy-focused search engine developed by the Brave browser, which uses AI to filter out bias and deliver relevant results.

 

You.com
You.com
– Description: You.com is a search engine that uses AI to provide direct answers to queries instead of just listing web pages.

 

Neeva
Neeva
– Description: Neeva was an ad-free search engine powered by AI that aimed to provide personalized and relevant search results without tracking user data. (Neeva shut down its consumer search engine in 2023.)

 

Wolfram Alpha
Wolfram Alpha
– Description: Wolfram Alpha is a computational knowledge engine that uses AI to answer queries by computing answers from curated data.

 

Presearch
Presearch
– Description: Presearch is a decentralized, community-driven search engine that rewards users with cryptocurrency for searching and contributing to the platform.

 

Dejian
– dejian
– Description: Dejian is an AI-powered search engine that aims to provide more accurate and relevant search results by understanding natural language queries.

 

Askwise
Askwise
– Description: Askwise is an AI-powered search engine that uses natural language processing to provide direct answers to questions, rather than just listing web pages.

 

Yieko
Yieko
– Description: Yieko is an AI-powered search engine that uses natural language processing and machine learning to provide relevant and personalized search results.

 

Apr 11

Meta’s AI agent understands physical spaces via interactive conversational questions

This framework has the potential to advance the field of AI by enabling machines to interact with the world in a more human-like manner, facilitating applications such as virtual assistants, autonomous robots, and interactive storytelling systems.

 

The Open-Vocabulary Embodied Question Answering (OpenEQA) framework is a system designed to enable machines to answer questions about their surroundings in a more interactive and nuanced manner.

 

Unlike traditional question answering systems that rely heavily on predefined databases or knowledge graphs, OpenEQA aims to allow machines to understand and respond to questions about their environment in real-time, without prior knowledge.

 

This framework integrates techniques from natural language processing (NLP), computer vision, and reinforcement learning to enable agents to navigate and interact with their environment while understanding and responding to questions asked in natural language.

By combining these different modalities, OpenEQA strives to create more robust and versatile question answering systems that can adapt to various environments and scenarios.

 

Embodied AI: OpenEQA: From word models to world models

Feb 25

Some publications and open letters that have warned about the potential risks and implications of advanced AI development – 2014 – now

Some experts warn that artificial intelligence (AI) could pose catastrophic risks to national security and even human extinction.
A March 2024 report commissioned by the US State Department and produced by Gladstone AI, a US-based company that promotes responsible AI development, states that advanced AI systems could be stolen, weaponized, or escape controls.
The report also says that AI labs are concerned about losing control of the systems they’re developing, which could have devastating consequences for global security.
The report’s title is “An Action Plan to Increase the Safety and Security of Advanced AI”.

An Action Plan to Increase the Safety and Security of Advanced AI

 

Other dangers of AI might be:
– Automation of jobs
– Spread of fake news
– Arms race of AI-powered weaponry
– Lack of transparency
– Bias
– Lack of protection for online data privacy
– Opportunity for people to cheat
– Overall human obsolescence

 

Some significant publications and open letters that have warned about the potential risks and implications of advanced AI development:

 

“Potential Risks from Advanced Artificial Intelligence” (2017)
An open letter signed by over 8,000 individuals, including prominent figures like Elon Musk and Stephen Hawking, expressing concerns about the potential risks of advanced AI and calling for increased research into AI safety.

Potential Risks from Advanced Artificial Intelligence

 

 

“Autonomous Weapons: An Open Letter from AI & Robotics Researchers” (2015)
– This open letter, signed by over 3,000 AI researchers, called for a ban on offensive autonomous weapons and expressed concerns about the potential dangers of artificial intelligence in warfare.

Autonomous Weapons: An Open Letter from AI & Robotics Researchers

 

“Research Priorities for Robust and Beneficial Artificial Intelligence” (2016)
– This publication by researchers from the Future of Humanity Institute and the Machine Intelligence Research Institute outlined key research priorities for ensuring that advanced AI systems remain robust and beneficial.

Research Priorities for Robust and Beneficial Artificial Intelligence

 

“Asilomar AI Principles” (2017)
– A set of principles developed at the Asilomar AI conference, which aimed to guide the development of beneficial AI systems and address issues such as safety, ethics, and social impact.

Asilomar AI Principles

 

“The Malicious Use of Artificial Intelligence” (2018)
– A report by researchers from various institutions, including the University of Cambridge and the Future of Humanity Institute, highlighting the potential risks of AI systems being used for malicious purposes.

The Malicious Use of Artificial Intelligence

 

“Windfall: The Path to Prosperity After Artificial Intelligence” (2019)
– A book by Stanford University professor Erik Brynjolfsson and researcher Andrew McAfee, which discussed the potential economic disruptions and societal implications of advanced AI.

https://erikbryn.com/windfall/

 

“The One Hundred Year Study on Artificial Intelligence” (2016-2021)
– A multiyear study by Stanford University aimed at analyzing the potential impacts of AI on various domains, including ethics, economics, and policy.

The One Hundred Year Study on Artificial Intelligence

 

“The Ethics of Artificial Intelligence” (2020)
– A report by the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems, which outlined ethical considerations and recommendations for the development and deployment of AI systems.

The Ethics of Artificial Intelligence

 

“Superintelligence: Paths, Dangers, Strategies” (2014)
– A book by philosopher Nick Bostrom, which explored the potential risks and existential threats posed by advanced artificial superintelligence.

Superintelligence: Paths, Dangers, Strategies

 

“AI Ethics in Research and Development” (2022)
– A report by the AI Ethics Board of the European Union, which provided guidance and recommendations for the ethical development and deployment of AI systems within the EU.

AI Ethics in Research and Development

 

These and other publications, open letters, and reports have raised awareness about the potential risks, ethical considerations, and societal implications of advanced AI development. They have sparked discussions and calls for responsible AI governance, safety research, and proactive measures to mitigate potential negative consequences.

 

Feb 22

Attempts to regulate the development of Artificial Intelligence by states and public institutions – 2018 – now

Artificial intelligence (AI) regulations are public sector laws and policies that promote and regulate AI. 
The goal of AI regulation is to balance innovation with managing risks. 
The basic approach to regulation focuses on the risks and biases of machine-learning algorithms, including the input data, algorithm testing, and decision model.

 

Here is an overview of significant AI regulation from states and institutions in recent years:

 

2018:
– The European Union introduced the General Data Protection Regulation (GDPR), which includes provisions related to automated decision-making and AI systems that process personal data.
– Link: gdpr.eu/

 

2019:
– The Organization for Economic Co-operation and Development (OECD) released its Principles on Artificial Intelligence, outlining recommendations for responsible and trustworthy AI development and deployment.
– Link: oecd.org/going-digital/ai/principles/

 

– The U.S. National Institute of Standards and Technology (NIST) released its Plan for Federal Engagement in AI Standards and prepared a framework for AI governance.
– Link: nist.gov/system/ai_standards_fedengagement_plan_9aug2019.pdf

 

2020:
– The European Commission released a white paper on AI, proposing regulatory measures for high-risk AI systems and laying the groundwork for future AI regulations.
– Link: ec.europa.eu/info/publications/white-paper-artificial-intelligence-european-approach-excellence-and-trust_en

 

– The U.S. Office of Management and Budget (OMB) released guidance for regulating AI applications in the federal government.
– Link:whitehouse.gov/wp-content/uploads/2020/11/M-21-06.pdf

 

2021:
– The European Commission proposed the Artificial Intelligence Act, a comprehensive set of rules governing the development, deployment, and use of AI systems across the European Union.
– Link: artificial-intelligence-act.europortal.info/

 

– The U.S. National Artificial Intelligence Initiative Act was signed into law, providing a strategic plan and funding for AI research and development.
– Link: congress.gov/bill/116th-congress/house-bill/6216

 

2022:
– The U.S. Federal Trade Commission (FTC) issued guidance on promoting truth, fairness, and equity in AI systems.
– Link: ftc.gov/ftc-issues-guidance-promoting-truth-fairness-equity-ai

 

– The United Kingdom established the Office for Artificial Intelligence and appointed a national AI adviser to lead the government’s AI strategy and governance efforts.
– Link: gov.uk/government/groups/office-for-artificial-intelligence

 

– The OECD released the AI Principles Practice, providing guidance on implementing the OECD AI Principles.
– Link: oecd.org/going-digital/ai/principles/

 

2023:
– The European Parliament and Council reached a provisional agreement on the proposed AI Act, paving the way for the first comprehensive AI regulation in the EU.
– Link: artificialintelligenceact.eu/

 

– The U.S. National AI Advisory Committee released its final report, providing recommendations for the responsible development and deployment of AI systems.
– Link: ai.gov/naiafr/

 

– The G7 countries agreed to a set of principles for the responsible development and use of AI, focusing on human rights, democratic values, and the rule of law.
– Link: g7germany.de/resource/blob/974430/2062292/768781ee9e8f69d4819318c87355ed31/2023-05-19-2-aia-en-data.pdf

 

These links provide access to official websites, reports, and documents related to the AI regulations and guidelines proposed or implemented by various states and institutions.

 

As AI technology continues to advance, these efforts aim to ensure the responsible and ethical development, deployment, and governance of AI systems, addressing issues such as privacy, fairness, transparency, and accountability.

 

Jan 01

The AI Index Report 2024 – The state of AI

The AI Index report tracks, distills, and visualizes data related to artificial intelligence (AI). Its mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI.

Here is a summary of the key findings for 2023:

1. AI beats humans on some tasks, but not on all.
AI has surpassed human performance on several benchmarks, including some in image classification, visual reasoning, and English understanding.
Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning, and planning.

 

2. The tech industry continues to dominate frontier AI research.
In 2023, industry produced 51 notable machine learning models, while academia contributed only 15.

 

3. Frontier AI models get way more expensive.
According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels.
For example, OpenAI’s GPT-4 used an estimated $78 million worth of compute to train, while Google’s Gemini Ultra cost $191 million for compute.

 

4. The United States leads China, the EU, and the U.K. as the leading source of top AI models.
In 2023, 61 notable AI models originated from U.S.-based institutions, far outpacing the European Union’s 21 and China’s 15.

 

5. Robust and standardized evaluations for LLM responsibility are seriously lacking.
New research from the AI Index reveals a significant lack of standardization in responsible AI reporting.
Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different internal AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models.

 

6. Generative AI investment skyrockets.
Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach $25.2 billion.
Major players in the generative AI space, including OpenAI, Anthropic, Hugging Face, and Inflection, reported substantial fundraising rounds.

 

7. AI makes workers more productive and leads to higher-quality work.
In 2023, several studies assessed AI’s impact on labor, suggesting that AI enables workers to complete tasks more quickly and to improve the quality of their output. These studies also demonstrated AI’s potential to bridge the skill gap between low- and high-skilled workers.
Still, other studies caution that using AI without proper oversight can lead to diminished performance.

 

8. Scientific progress accelerates even further, thanks to AI.
In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications – from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery.

 

9. The number of AI regulations in the United States sharply increases.
The number of AI-related regulations in the U.S. has risen significantly in the past year and over the last five years. In 2023, there were 25 AI-related regulations, up from just one in 2016. Last year alone, the total number of AI-related regulations grew by 56.3%.

 

10. People across the globe are more aware of AI’s potential impact – and more nervous.
A survey from Ipsos shows that, over the last year, the proportion of those who think AI will dramatically affect their lives in the next three to five years has increased from 60% to 66%. Moreover, 52% express nervousness toward AI products and services, marking a 13 percentage point rise from 2022.

In America, Pew data suggests that 52% of Americans report feeling more concerned than excited about AI, rising from 37% in 2022.

Jan 01

AI Patents per Country and year show the speed of AI related inventions.

AI patents per country and year show the speed of Artificial Intelligence related inventions – and also the “arms race” between the US and China for the top position.

2023
Apr 23

DALL-E and Midjourney & Co – Generative AI models that create images “like magic”

DALL-E and Midjourney are both AI image generation tools that can create images from text prompts.
They use diffusion training: the AI learns image–text associations while training images are progressively broken down into random noise, and the model then learns to reverse that process to generate new images.
Both tools are already used by artists, graphic designers, and hobby creators.
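The “breaking down into noise” half of diffusion training can be illustrated in a few lines of NumPy. This is a toy forward (noising) process only – the step count and beta value are illustrative, not taken from any real model – and a real diffusion model is then trained to run this process in reverse:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_diffuse(image, steps=1000, beta=0.02):
    # Forward (noising) process: at each step, shrink the signal slightly
    # and add a little Gaussian noise. After enough steps the "image" is
    # statistically indistinguishable from pure random noise.
    x = image.copy()
    for _ in range(steps):
        x = np.sqrt(1 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
    return x

image = np.ones((8, 8))        # stand-in for a real training image
noised = forward_diffuse(image)

# The original signal has decayed by (1 - beta)^(steps/2), i.e. essentially
# to zero; what remains is approximately unit-variance Gaussian noise.
print(abs(noised.mean()))
```

Generation then runs the learned reverse direction: start from pure noise and denoise step by step, guided by the text prompt.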

 

DALL-E website

Midjourney showcase website

 

2021
Jun 10

Google releases LaMDA, a conversational AI model

LaMDA is a conversational AI model developed by Google, specifically designed to engage in natural and free-flowing conversations on a wide range of topics. You can essentially chat with it about anything – from morality and philosophy to movies and the latest news.

 

It aims to understand context and generate more nuanced responses compared to traditional language models. Google intends to use LaMDA in various applications, such as chatbots, virtual assistants, and customer service interactions, to provide more human-like interactions.

 

LaMDA stands for “Language Model for Dialogue Applications.”

 

 

Apr 16

Generative AI – the rapid development of AI that can generate text, images, audio, video, etc. – 2018 – now

Generative AI are artificial intelligence systems that are capable of generating new content, such as text, images, audio, or video, based on the patterns and relationships learned from training data. These systems use deep learning techniques, particularly generative models like variational autoencoders (VAEs), generative adversarial networks (GANs), and transformer-based language models, to create novel and realistic outputs.
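As a minimal illustration of “learning patterns from training data, then generating novel output”, here is a character-level bigram sampler – the simplest possible generative model. The corpus and names are made up for the example; real systems use the far more powerful architectures named above:

```python
from collections import defaultdict
import random

random.seed(0)  # fixed seed so the example is reproducible

def train_bigram(corpus):
    # "Training": record which character follows which — the simplest
    # statistical pattern a generative model can learn from data.
    model = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        model[a].append(b)
    return model

def generate(model, start, length=20):
    # "Generation": repeatedly sample a plausible next character,
    # producing novel text that follows the learned statistics.
    out = start
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out += random.choice(followers)
    return out

model = train_bigram("the theory there then ")
print(generate(model, "th"))
```

VAEs, GANs, and transformers replace these bigram counts with learned neural representations, but the principle is the same: fit the statistics of the training data, then sample from them.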

 

 

Timeline of some significant milestones in the development of Generative AI, along with the companies and research groups involved:

 

2018: GPT (Generative Pre-trained Transformer) (OpenAI)
– OpenAI introduced GPT, a transformer-based language model that could generate human-like text by learning from a large corpus of online data.
– GPT demonstrated the potential of transformer models for natural language generation tasks.

 

2019: BigGAN (DeepMind, Google Brain)
– DeepMind and Google Brain researchers introduced BigGAN, a large-scale GAN model capable of generating high-resolution, diverse, and coherent images.
– BigGAN showed the scalability of GANs and their ability to generate high-quality images across various domains.

 

2020: DALL-E (OpenAI)
– OpenAI introduced DALL-E, a transformer-based model that could generate realistic images from text descriptions.
– DALL-E demonstrated the ability of generative models to understand and translate natural language into visual representations.

 

2022: Stable Diffusion (Stability AI, CompVis, Runway)
– Stability AI, together with the CompVis group and Runway, released Stable Diffusion, an open-source text-to-image generative model based on latent diffusion.
– Stable Diffusion made high-quality image generation accessible to a wider audience and fueled the growth of AI art.

 

2022: DALL-E 2 (OpenAI), Imagen (Google Brain), and Parti (Google Brain)
– OpenAI introduced DALL-E 2, an advanced version of DALL-E with improved image generation capabilities.
– Google Brain released Imagen, a text-to-image model that could generate high-resolution images with unprecedented quality and detail.
– Google Brain also introduced Parti, an autoregressive text-to-image model that generates images from natural language descriptions.

 

2023: GPT-4 (OpenAI), Claude (Anthropic)
– OpenAI released GPT-4, a powerful language model with multimodal capabilities, including image and text generation.
– Anthropic introduced Claude, a conversational AI system capable of generating human-like responses and handling a wide range of tasks.

 

This timeline highlights the rapid progress in generative AI, driven by advancements in deep learning architectures, increasing computational power, and the availability of large datasets. 

 

Here are some applications of generative AI across different fields:

 

1. Art and Design:
– Generating realistic images, artwork, and designs from text descriptions or sketches
– Creating unique textures, patterns, and visual effects
– Exploring new artistic styles and creative expressions

 

2. Media and Entertainment:
– Generating realistic characters, environments, and animations for movies and video games
– Creating synthetic voice-overs, music, and sound effects
– Generating personalized stories, scripts, and dialogue

 

3. Marketing and Advertising:
– Generating personalized marketing content, such as product descriptions, advertisements, and social media posts
– Creating synthetic product images and visualizations
– Generating realistic virtual models and influencers for marketing campaigns

 

4. Scientific Research:
– Generating synthetic data for training and testing machine learning models
– Simulating complex physical or biological systems for experimentation
– Generating novel molecular structures for drug discovery and material design

 

5. Natural Language Processing:
– Generating human-like text for content creation, summarization, and translation
– Creating synthetic training data for language models and conversational AI
– Generating personalized responses and recommendations in virtual assistants

 

6. Computer Vision:
– Generating synthetic images for training and data augmentation
– Creating realistic virtual environments and scenarios for simulations
– Generating image-to-image translations, such as style transfer and image enhancement

 

7. Finance and Business:
– Generating synthetic financial data for risk modeling and scenario analysis
– Creating personalized reports, summaries, and business analyses
– Generating realistic simulations for training and decision-making

 

2020
Feb 01

Google’s AlphaFold solves the protein-folding problem

Google’s DeepMind made a leap in the field of AI with its system AlphaFold, which is a solution to the “protein-folding problem”. For 50 years, scientists had been trying to predict how a protein would fold to help understand and treat diseases. AlphaFold did that in a short time.

Proteins are the building blocks of life, and the way a protein folds determines its function; a misfolded protein can cause disease.

 

Then, in 2022, Google shared 200 million of AlphaFold’s protein structures — covering almost every organism on the planet that has had its genome sequenced — freely with the scientific community via the AlphaFold Protein Structure Database.

 

More than 1 million researchers have already used it to work on everything from accelerating new malaria vaccines in record time to advancing cancer drug discovery and developing plastic-eating enzymes.

 

AlphaFold – accelerating research in nearly every field of biology

2019
Apr 15

Google Search understands your search queries better thanks to “BERT”

Rather than aiming to understand words individually, BERT helps Google understand words in context. This led to a huge quality improvement across Search and made it easier for people to ask questions as they naturally would, rather than by stringing keywords together.

 

Google’s research on Transformers led to the introduction of Bidirectional Encoder Representations from Transformers, or BERT for short.

It helped Search understand users’ queries better than ever before.

 

Understanding searches better than ever before

2017
Apr 22

Large Language Models – larger and larger LLMs leading to ChatGPT – 2017 onward

LLMs, or Large Language Models, are a class of artificial intelligence models capable of understanding and generating human-like text at scale. These models are typically based on deep learning architectures, such as transformers, and are trained on vast amounts of text data to learn the statistical patterns and structures of natural language.
LLMs have the ability to generate coherent and contextually relevant text across a wide range of tasks, including language translation, text summarization, question answering, and conversational agents. They have significantly advanced the field of natural language processing (NLP) and have enabled breakthroughs in various applications that require language understanding and generation.

 

Here’s a short timeline highlighting some key milestones in the development of LLMs in recent years:

 

Transformers (2017)
– The transformer architecture, introduced in the paper “Attention is All You Need,” marked a significant milestone in the development of LLMs. Transformers replaced recurrent neural networks (RNNs) in many NLP tasks and achieved state-of-the-art performance on various benchmarks.
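The heart of the transformer is the scaled dot-product attention operation the paper defines as softmax(Q·K^T / sqrt(d_k))·V. A minimal NumPy sketch with toy shapes and random values (not an implementation of any real model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Core operation from "Attention Is All You Need":
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                       # weighted mix of the value vectors

# Toy example: 3 tokens, each represented by a 4-dimensional vector.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))

out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one context-mixed vector per token
```

Because every token attends to every other token in one matrix multiply, transformers process sequences in parallel, which is what let them displace recurrent networks.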

 

BERT (2018)
– Bidirectional Encoder Representations from Transformers (BERT) introduced a pre-training strategy for LLMs that significantly improved their performance on downstream NLP tasks. BERT pre-training involved masking random words in input sentences and training the model to predict the missing words bidirectionally.
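The masking objective described above can be sketched in a few lines. This is a simplified illustration: real BERT selects about 15% of tokens and applies an 80/10/10 mask/random/keep rule, whereas here every selected token simply becomes [MASK] (and the mask rate is raised so the tiny example shows an effect):

```python
import random

random.seed(0)  # fixed seed so the example is reproducible

def mask_tokens(tokens, mask_rate=0.3, mask_token="[MASK]"):
    # BERT-style pre-training input: hide a fraction of the tokens.
    # The model's training objective is to predict the hidden originals
    # from the surrounding (bidirectional) context.
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if random.random() < mask_rate:
            masked.append(mask_token)
            targets[i] = tok          # what the model must reconstruct
        else:
            masked.append(tok)
    return masked, targets

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens)
print(masked)
print(targets)
```

Training on billions of such (masked input, hidden originals) pairs is what gives BERT its contextual word representations.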

 

GPT-2 (2019)
– OpenAI released the GPT-2 model, which featured a large-scale transformer architecture with 1.5 billion parameters. GPT-2 demonstrated remarkable capabilities in generating coherent and contextually relevant text across a wide range of prompts and tasks.

 

GPT-3 (2020)
– OpenAI released GPT-3, the third iteration of their Generative Pre-trained Transformer model. With 175 billion parameters, GPT-3 represented a significant leap in scale and performance, showcasing the potential of LLMs for natural language understanding and generation.

 

GPT-4 (2023)
The newest LLM released by OpenAI.
Key features: 

More creative and informative: Compared to GPT-3, GPT-4 has improved capabilities in generating different creative text formats, writing different kinds of content, and collaborating on writing tasks. 

Multimodal abilities: Unlike its predecessors, GPT-4 can not only handle text but also analyze images. This allows for tasks like describing images, summarizing text in screenshots, and answering questions based on diagrams. 

Improved context: GPT-4 works with a larger context window compared to GPT-3, allowing it to handle longer conversations, analyze extended pieces of text, and create more coherent content. 

 

Various generative AI tools now exist, although text and image generation models are arguably the most well-known. Generative AI models typically rely on a user feeding a prompt into the engine to guide it towards producing a desired output, be it text, an image, a video or a piece of music, though some models can also run without an explicit prompt.

 

Examples of generative AI models include:

 

ChatGPT: An AI language model developed by OpenAI that can answer questions and generate human-like responses from text prompts.

Visit ChatGPT

 

 

DALL-E 3: Another AI model by OpenAI that can create images and artwork from text prompts.

Visit DALL-E 3

 

 

Google Gemini: Previously known as Bard, Gemini is Google’s generative AI chatbot and rival to ChatGPT. Bard was initially powered by Google’s LaMDA and later PaLM language models, before being rebuilt around the Gemini model family. It can answer questions and generate text from prompts.

Visit Google Gemini

 

 

Claude 2.1: Anthropic’s AI model, Claude, offers a 200,000-token context window, which its creators claim can handle more data than its competitors.

Visit Claude 2.1

 

 

Midjourney: Developed by San Francisco-based research lab Midjourney Inc., this gen AI model interprets text prompts to produce images and artwork, similar to DALL-E.

Visit Midjourney

 

 

GitHub Copilot: An AI-powered coding tool that suggests code completions within the Visual Studio, Neovim and JetBrains development environments.

Visit GitHub Copilot

 

 

Llama 2: Meta’s open-source large language model can be used to create conversational AI models for chatbots and virtual assistants, similar to GPT-4.

Visit Llama 2

 

 

Grok: After co-founding and helping to fund OpenAI, Elon Musk left its board in 2018; in July 2023 he announced a new generative AI venture, xAI. Its first model, the irreverent Grok, came out in November 2023.

Visit Grok

 

 

Read More
2017
Apr 14

Google Research introduces “The Transformer” neural network architecture

The Transformer has revolutionized what it means for machines to perform translation, text summarization, question answering and even image generation and robotics.

 

The Google Research paper “Attention Is All You Need” introduced the Transformer, a new neural network architecture that helped with language understanding.

 

Before the Transformer, machines were not very good at understanding the meaning of long sentences — they couldn’t see the relationships between words that were far apart.

 

The Transformer hugely improved this and has become the bedrock of today’s most impressive language understanding and generative AI systems.

 

Transformer: A Novel Neural Network Architecture for Language Understanding

Read More
2016
Apr 24

Google’s DeepMind AlphaGo mastered the ancient game of Go

The AlphaGo AI mastered the ancient game of Go, defeated a Go world champion, and inspired a new era of AI systems.

Deepmind AlphaGo

 

Read More
2010
Apr 22

Neural Networks, the fundamental architecture of AI – their rapid development from 2010 onwards

Artificial Neural Networks or Neural Nets are computational models – computer algorithms – inspired by the structure and function of the human brain. They consist of interconnected nodes, called neurons, organized in layers. Neural networks are capable of learning complex patterns in data and are widely used in various machine learning tasks.

 

Neural networks are trained by adjusting the strengths of the connections (weights) between neurons based on example inputs and corresponding outputs, typically using algorithms such as backpropagation and gradient descent. Through this process, neural networks learn complex patterns in data and can make predictions or decisions based on new input.
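The "nudge the weights to reduce the error" idea can be shown with a deliberately tiny example: a single neuron with one weight learning the mapping y = 2x. (Real networks stack many such units in layers and propagate the error backwards through all of them, but the principle is the same.)

```python
# Toy example: one neuron with weight w learning y = 2x via gradient descent.
def train(samples, lr=0.1, epochs=50):
    w = 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x
            error = pred - y
            w -= lr * error * x   # gradient of the squared error w.r.t. w
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(data)
print(round(w, 3))  # converges toward 2.0
```

Each update moves the weight a small step in the direction that shrinks the prediction error; repeated over many examples, the weight settles at a value that fits the data.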

 

If you really want to know how this works, watch this simple ;  ) video.

 

 

Research into neural networks has been going on since the mid-20th century, with a wave of renewed interest in the 1980s. But because of limited computing power and scarce training data, little progress was possible for decades.

 

This changed dramatically from the 2010s onwards.
Here are some – more technical – details about the growing importance and use of Deep Neural Networks in different domains:

 

Neural Network Deep Learning Breakthroughs:
– The 2010s saw significant breakthroughs in deep learning research, with neural networks achieving state-of-the-art performance in various domains. This period witnessed the development of deep learning architectures such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer models.

 

Image Recognition and Computer Vision
– CNNs, particularly deep CNNs, became the standard architecture for image recognition and computer vision tasks and achieved remarkable performance on image classification, object detection, and image segmentation tasks.

 

Natural Language Processing (NLP)
– Transformer models, such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), revolutionized natural language processing tasks. These models surpassed previous benchmarks in tasks like language understanding, question answering, text generation, and machine translation.

 

Transfer Learning and Pre-training
– Transfer learning and pre-training became dominant paradigms in deep learning, where models are first trained on large-scale datasets and then fine-tuned for specific tasks. Pre-trained models, such as those released by OpenAI (GPT, GPT-2, GPT-3) and Google (BERT), demonstrated superior performance and generalization across diverse NLP tasks.

 

Reinforcement Learning Advances
– Reinforcement learning (RL) algorithms, particularly deep reinforcement learning (DRL), made significant strides in solving complex sequential decision-making tasks. Deep Q-Networks (DQN), Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and AlphaGo demonstrated impressive results in games, robotics, and autonomous systems.

 

Autonomous Vehicles and Robotics
– Neural networks played a crucial role in the development of autonomous vehicles and robotics. Deep learning models enabled advancements in perception, localization, mapping, and control, leading to autonomous driving systems deployed by companies like Tesla, Waymo, and Uber, as well as robotic applications in manufacturing, healthcare, and logistics.

 

Healthcare and Biomedical Applications
– Neural networks were increasingly applied in healthcare and biomedical research for tasks such as medical image analysis, disease diagnosis, drug discovery, and personalized medicine. Deep learning models demonstrated promising results in detecting diseases from medical images, predicting patient outcomes, and analyzing genomic data.

 

Ethical and Societal Implications
– The rapid advancement of neural networks raised concerns about their ethical and societal implications. Debates around issues like bias in AI systems, privacy concerns, algorithmic fairness, and responsible AI research gained prominence, leading to efforts to address these challenges through interdisciplinary collaborations and policy interventions.

 

Read More
2006
Apr 06

Google Translate

Google Translate launched in 2006 and used machine learning to automatically translate languages. It started with Arabic to English and English to Arabic translations.

 

Today Google Translate supports 133 languages spoken by millions of people around the world.

 

This technology can translate text, images or even a conversation in real time, breaking down language barriers across the global community, helping people communicate and expanding access to information like never before.

 

https://translate.google.com/

 

 

Read More
2001
Apr 15

Machine learning helps Google Search users correct their spelling

In 2001, Google introduced machine learning to enhance its search engine capabilities, particularly in correcting users’ spelling errors.

 

This innovation marked a significant advancement in improving the accuracy and relevance of search results. By leveraging machine learning algorithms, Google was able to analyze patterns in user queries and suggest corrections for misspelled words, ultimately leading to a more seamless and effective search experience for users.
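One classic building block of spelling correction is edit distance: how many single-character changes turn the user's query into a dictionary word. Google's production system learned corrections from query logs, but a minimal dynamic-programming sketch of the underlying distance measure looks like this:

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                  # delete all of a's prefix
    for j in range(n + 1):
        dp[0][j] = j                  # insert all of b's prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

print(edit_distance("speling", "spelling"))  # 1
```

A simple corrector can rank dictionary words by this distance; a learned system additionally weights candidates by how often real users typed them.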

 

This integration of machine learning into Google Search exemplifies the continuous evolution of technology to better serve user needs and enhance overall user satisfaction.

Read More
1974
Apr 18

AI Winter ca 1974 – 1980 and ca 1987 – 2000

The AI winter refers to periods of time when enthusiasm and funding for Artificial Intelligence research and development declined significantly, often due to unmet expectations or failures to deliver on the promised capabilities of AI technologies. The term was first coined in the 1980s when AI research faced a downturn after initial hype in the 1950s and 1960s.

During AI winters, funding for AI projects decreases, leading to a slowdown in research and development efforts. This can result from factors such as overhyped expectations, technological limitations, or failures to achieve significant breakthroughs. 

 

Read More
1966
Apr 18

ELIZA – the first-ever chatbot

Long before Siri, Alexa, ChatGPT & Co, there was ELIZA, the grandmother of all chatbots :-D. 

 

In 1966, MIT researcher Joseph Weizenbaum created ELIZA, the first chatbot, which marked the beginning of research into Natural Language Processing (NLP). ELIZA used pattern recognition and responded to programmed trigger words to simulate human conversation. (Hint: ELIZA was not super intelligent 😉)
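The trick is simple enough to sketch in a few lines: match the input against trigger patterns and echo part of it back as a question. (These rules are hypothetical illustrations, not Weizenbaum's original script.)

```python
import re

# A tiny ELIZA-style responder: match trigger patterns and reflect
# fragments of the user's input back as a question.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I),   "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.I),     "Tell me more about your {0}."),
]

def respond(text):
    for pattern, template in RULES:
        m = pattern.search(text)
        if m:
            return template.format(m.group(1))
    return "Please, go on."    # fallback when no trigger matches

print(respond("I feel anxious about AI"))  # Why do you feel anxious about AI?
```

No understanding is involved: the program never models what "anxious" means, it just rearranges the user's own words, which is exactly why ELIZA felt surprisingly human while being anything but intelligent.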

 

ELIZA simulated a conversational interaction somewhat similar to what might take place in the office of a psychotherapist in an initial psychiatric interview.

Here is a recreation of the original program

 

Enjoy your free therapy session ;  )

 

 

 

Read More
1956
Apr 17

The Dartmouth Summer Research Project on Artificial Intelligence

Held in the summer of 1956, the “Dartmouth Summer Research Project on Artificial Intelligence” brought together some of the brightest minds in computing and cognitive science. The Dartmouth Conference is considered to be the “foundation stone” of the field of Artificial Intelligence (AI).

 

 

The project lasted approximately six to eight weeks and was essentially an extended brainstorming session.

 

Those were some of the invited “geniuses” :

Dr. Marvin Minsky

Dr. Julian Bigelow

Professor D.M. Mackay

Mr. Ray Solomonoff

Mr. John Holland

Dr. John McCarthy

Dr. Claude Shannon

Mr. Nathaniel Rochester

Mr. Oliver Selfridge

Dr. Allen Newell

Professor Herbert Simon

 

 

Obviously, no woman was invited.
Welcome to the 1950s!

Read More
1953
Apr 17

First use of the term “Artificial Intelligence”


John McCarthy, an American computer scientist and cognitive scientist, coined the term “artificial intelligence” when proposing a workshop on the subject, which is the first recorded use of the term and how it came into popular usage.

He is considered to be one of the “founding fathers” of Artificial Intelligence.

Read More