AI Tools and Techniques

Developing AI for education often requires combining several tools and techniques. This section helps developers understand which tools and techniques are emerging and how they can apply to education, and it links to more information for building solutions that deliver on these ideas.

This list of AI tools is based on the definitions developed by Hugging Face and scikit-learn.

AI Tools

Here is the complete list of AI tools and techniques that might be used to bring these ideas to reality.

| Tags | Definition | Education example | Link for more information |
| --- | --- | --- | --- |
| ML: Classification | Identifying which category an object belongs to. | Support Teaching at the Right Level (TaRL) by classifying students by the topics they struggle with and grouping them for support. | Scikit-Learn: Classification |
| ML: Regression | Predicting a continuous-valued attribute associated with an object. | Enable better understanding of the factors that predict the success of education interventions. | Scikit-Learn: Regression |
| ML: Clustering | Automatic grouping of similar objects into sets. | Grouping students in large classes based on formative assessments so teachers can give targeted support. | Scikit-Learn: Clustering |
| ML: Dimensionality Reduction | Reducing the number of random variables to consider. | Transforming data prior to analysis to better enable classification, clustering or regression analysis. | Scikit-Learn: Dimensionality Reduction |
| ML: Model | Comparing, validating, and choosing parameters and models. | N/A | Scikit-Learn: Model |
| ML: Pre-processing | Feature extraction and normalization. | N/A | Scikit-Learn: Pre-processing |
| CV: Depth | Depth estimation is the task of predicting the depth of the objects present in an image. | Use depth estimation to enable AR apps to identify real-world objects so students can interact with them. | Huggingface: Depth |
| CV: Image Classification | Image classification is the task of assigning a label or class to an entire image. Images are expected to have only one class for each image. | Classify images of a textbook so it can be automatically described to a partially sighted student. | Huggingface: Image Classification |
| CV: Segmentation | Image Segmentation divides an image into segments where each pixel in the image is mapped to an object. | Segmentation is crucial in creating augmented reality tools to enhance the properties of existing learning resources. | Huggingface: Segmentation |
| CV: Image2Image | Image-to-image is the task of transforming a source image to match the characteristics of a target image or a target image domain. | AI can generate new, culturally relevant images based on less appropriate images in printed teaching resources. | Huggingface: Image2Image |
| CV: Object | Object Detection models allow users to identify objects of certain defined classes. | Recognise faces or animals in an image to generate a task for students that explores their understanding of that image. | Huggingface: Object |
| CV: Mask | Mask Generation in AI refers to the process of generating a mask that can be used to remove or replace certain parts of an image. | N/A | Huggingface: Mask |
| CV: Video Classification | Video classification is the task of assigning a label or class to an entire video. | N/A | Huggingface: Video Classification |
| CV: Image Generation | Unconditional image generation is the task of generating images with no condition in any context (like a prompt text or another image). | N/A | Huggingface: Image Generation |
| CV: Zero Shot Image | Zero-shot image classification is the task of classifying images into classes that were not seen during the training of a model. | N/A | Huggingface: Zero Shot Image |
| CV: Zero Shot Object | Zero-Shot Object Detection is a technique that identifies objects in images based on free-text queries without the need for labelled datasets. | N/A | Huggingface: Zero Shot Object |
| NLP: Conversational | Conversational response modelling is the task of generating conversational text that is relevant, coherent and knowledgeable given a prompt. | AI can be used as a talking partner for a teacher to develop specific knowledge or skills. | Huggingface: Conversational |
| NLP: Fill Mask | Masked language modelling is the task of masking some of the words in a sentence and predicting which words should replace those masks. | N/A | Huggingface: Fill Mask |
| NLP: Question | Question Answering models can retrieve the answer to a question from a given text, which is useful for searching for an answer in a document. | AI can respond to questions from students or teachers about subject content. | Huggingface: Question |
| NLP: Sentence Similarity | Sentence similarity is the task of determining how similar two texts are. Sentence similarity models convert input texts into vectors (embeddings) that capture semantic information and calculate how close (similar) they are to each other. | N/A | Huggingface: Sentence Similarity |
| NLP: Summarisation | Summarisation is the task of producing a shorter version of a document while preserving its important information. | Summarise longer texts to give teachers key points when planning lessons. | Huggingface: Summarisation |
| NLP: Table Question Answering | Table Question Answering (Table QA) is the task of answering a question about information in a given table. | N/A | Huggingface: Table Question Answering |
| NLP: Text Classification | Text Classification is the task of assigning a label or class to a given text. Some use cases are sentiment analysis, natural language inference, and assessing grammatical correctness. | Check alignment between school resources and national curriculum to ensure coverage. | Huggingface: Text Classification |
| NLP: Generation | Generating text is the task of producing new text. These models can, for example, fill in incomplete text or paraphrase. | Generate revision questions for students aligning to the curriculum to help their examination preparation. | Huggingface: Generation |
| NLP: Token Classification | Token classification is a natural language understanding task in which a label is assigned to some tokens in a text. | N/A | Huggingface: Token Classification |
| NLP: Translation | Translation is the task of converting text from one language to another. | Translate resources from different contexts or cultures into local languages. | Huggingface: Translation |
| NLP: Zero Shot Classification | Zero-shot text classification is a task in natural language processing where a model is trained on a set of labelled examples but is then able to classify new examples from previously unseen classes. | N/A | Huggingface: Zero Shot Classification |
| Audio: Classification | Audio classification is the task of assigning a label or class to a given audio. It can be used for recognising which command a user is giving or the emotion of a statement, as well as identifying a speaker. | AI can analyse recorded group work to generate a record of the curriculum areas covered. | Huggingface: Classification |
| Audio: ASR | Automatic Speech Recognition (ASR), also known as Speech to Text (STT), is the task of transcribing a given audio to text. It has many applications, such as voice user interfaces. | Analyse voice data such as student responses in verbal reading assessments. | Huggingface: ASR |
| Audio: Audio2Audio | Audio-to-Audio is a family of tasks in which the input is an audio and the output is one or multiple generated audios. Some example tasks are speech enhancement and source separation. | N/A | Huggingface: Audio2Audio |
| Audio: Text2Speech | Text-to-Speech (TTS) is the task of generating natural sounding speech given text input. TTS models can be extended to have a single model that generates speech for multiple speakers and multiple languages. | Chatbots can respond to student or teacher queries using voice. | Huggingface: Text2Speech |
| Tabular: Classification | Tabular classification is the task of classifying a target category (a group) based on a set of attributes. | N/A | Huggingface: Classification |
| Tabular: Regression | Tabular regression is the task of predicting a numerical value given a set of attributes. | N/A | Huggingface: Regression |
| MM: Doc Questions | Document Question Answering (also known as Document Visual Question Answering) is the task of answering questions on document images. | N/A | Huggingface: Doc Questions |
| MM: Feature | Feature extraction refers to the process of transforming raw data into numerical features that can be processed while preserving the information in the original dataset. | N/A | Huggingface: Feature |
| MM: Image2Text | Image to text models output a text from a given image. Image captioning or optical character recognition can be considered as the most common applications of image to text. | Recognise student handwriting in written assessments and give feedback on their answers. | Huggingface: Image2Text |
| MM: Text2Image | Generates images from input text. These models can be used to generate and modify images based on text prompts. | AI can generate images relating to specific pages in textbooks or other resources to better illustrate what is being taught. | Huggingface: Text2Image |
| MM: Text2Video | Text-to-video models can be used in any application that requires generating a consistent sequence of images from text. | AI can convert text-based learning resources to video formats to improve accessibility. | Huggingface: Text2Video |
| MM: Visual | Visual Question Answering is the task of answering open-ended questions based on an image. These models output natural language responses to natural language questions. | N/A | Huggingface: Visual |
| MM: Text3D | Text-to-3D turns written words into a 3D interactive experience with images, sounds and other elements. | N/A | Huggingface: Text3D |
| MM: Image3D | Image-to-3D converts regular images into 3D models, allowing for immersive and realistic visual experiences. | N/A | Huggingface: Images3D |
| RL: Reinforcement | Reinforcement learning is the computational approach of learning from action by interacting with an environment through trial and error and receiving rewards (negative or positive) as feedback. | N/A | Huggingface: Reinforcement |
| Large Language Models | Large Language Models are the key component behind text generation. They consist of large pretrained transformer models trained to predict the next word (or, more precisely, token) given some input text. | A teacher chatbot that responds to teachers' voice or text questions. | Huggingface: Large Language Models |
| Large Multi-Modal Models | A transformer model used in multimodal settings, combining text and an image to make predictions. | AI can analyse resources with both image and text (such as textbooks) and identify gaps in coverage or alignment. | Huggingface: Large Multi-Modal Models |
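
To make the ML: Clustering entry more concrete, here is a minimal sketch of grouping students by formative-assessment scores so a teacher could target support. It is an illustration only: the student names, the scores, and the helper names `farthest_first` and `kmeans` are all hypothetical, and the algorithm is a plain k-means written with the Python standard library and a deterministic farthest-first seeding chosen here for reproducibility. A real project would more likely use a library implementation such as scikit-learn's `KMeans`.

```python
import math


def farthest_first(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from every centroid chosen so far (deterministic seeding)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Return (centroids, labels) after running plain k-means on `points`."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical formative-assessment scores: (reading %, numeracy %) per student.
scores = {
    "Amina": (35, 40), "Brian": (30, 45), "Chen": (38, 36),
    "Dede": (80, 85), "Esi": (85, 78), "Femi": (78, 82),
    "Grace": (40, 82), "Hadi": (35, 88), "Imani": (42, 79),
}

centroids, labels = kmeans(list(scores.values()), k=3)

# Collect student names under the cluster label they were assigned.
groups = {}
for name, label in zip(scores, labels):
    groups.setdefault(label, []).append(name)

for label, names in sorted(groups.items()):
    print(f"Group {label}: {', '.join(names)}")
```

With this made-up data the three groups correspond to students who score low on both measures, high on both, and low on reading but high on numeracy. The number of groups (`k=3`) is an assumption; in practice it would be chosen with the teacher or through model comparison.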

Abbreviation Key

ML: Machine Learning, CV: Computer Vision, NLP: Natural Language Processing, MM: Multi-Modal, RL: Reinforcement Learning

AI-for-Education.org was set up by Fab Inc. in partnership with Team4Tech. We are grateful to the Bill & Melinda Gates Foundation and the Jacobs Foundation for their support.