GenAI Learning

GenAI Learning - Getting Started

GenAI Initial Demo
Generative AI: tokens, chunking, embeddings, etc.

GenAI is transforming industries. It powers applications such as text generation and image creation.

It is built on pre-trained models, such as large language models (LLMs).

GenAI generates new content based on pre-trained models.
It can produce text, images, audio, video, and even code.

GenAI comprises:
  1. Models
  2. Transformers
  3. Prompt Engineering
  4. Inference
  5. Context Window
  6. Token
  7. Vector
  8. Embeddings
  9. Chunking
  10. Multimodal Models
  11. Diffusion Models


Transformer Network:
------------------------>
Context Window:
Foundation models, including models like GPT and BERT

Tokens and Tokenization:
The process of breaking down input into tokens.
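
As a rough illustration of tokenization, here is a minimal sketch that splits text into word and punctuation tokens and maps each token to an integer ID. The function name `simple_tokenize` and the tiny vocabulary are assumptions made for this example; production models such as GPT and BERT use learned subword tokenizers rather than a regex split.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenize("GenAI breaks input into tokens.")

# Build a toy vocabulary: each distinct token gets an integer ID.
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
token_ids = [vocab[tok] for tok in tokens]

print(tokens)
print(token_ids)
```

The model never sees raw text, only the sequence of IDs, which is why token counts (not character counts) determine how much fits in a context window.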

Embeddings and Vectors:

Chunking:
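
One common way to chunk a long document is a sliding window with overlap, so that text near a boundary appears in two neighboring chunks. The sketch below is only an illustration under that assumption; `chunk_text`, `chunk_size`, and `overlap` are hypothetical names, and the sizes are arbitrary.

```python
def chunk_text(words, chunk_size=5, overlap=2):
    """Split a list of words into overlapping chunks of at most chunk_size."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a chunk has reached the end of the input.
        if start + chunk_size >= len(words):
            break
    return chunks


text = "Chunking splits a long document into smaller pieces for embedding"
for chunk in chunk_text(text.split(), chunk_size=5, overlap=2):
    print(" ".join(chunk))
```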


Prompt Engineering:
Designing prompts to get the desired output. Techniques include zero-shot, one-shot, and few-shot learning.

Zero-shot learning
One-shot learning
Few-shot learning
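
The three techniques above differ only in how many worked examples the prompt contains. The sketch below shows one possible way to assemble such prompts; `build_prompt` and the "Input:/Output:" layout are assumptions for illustration, not a format required by any particular model.

```python
def build_prompt(task, examples, query):
    """Build a prompt with zero, one, or several worked examples."""
    parts = [task]
    for text, label in examples:
        parts.append(f"Input: {text}\nOutput: {label}")
    # The final block is left open for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


task = "Classify the sentiment as positive or negative."
examples = [("I love this product", "positive"),
            ("This is terrible", "negative")]

zero_shot = build_prompt(task, [], "The demo was great")
one_shot = build_prompt(task, examples[:1], "The demo was great")
few_shot = build_prompt(task, examples, "The demo was great")

print(few_shot)
```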


Multimodal Models:

Diffusion Models


GitHub Copilot


The AWS service for hosting generative AI models is Amazon Bedrock.

Architectures behind GenAI:
1. Generative Adversarial Network (GAN)
2. Variational Autoencoder (VAE)
3. Transformers

Foundation Model Lifecycle

Foundation Model Lifecycle Script
Data Selection
Model Selection
Pre-Training
Fine-Tuning
Evaluation
Deployment
Feedback & Monitoring
Iteration and Optimization

Prompt Engineering

User prompts
System prompts

AWS Infra for GenAI Apps:
--------------------------------->
SageMaker JumpStart

PartyRock - Playground for testing GenAI apps.

Optimizing AI with vector databases

Amazon Bedrock to host GenAI models

Amazon Bedrock for generative AI development

Foundational Models & Applications

Foundational Models Applications
1. Convolutional Neural Network
2. Recurrent Neural Network

AI Model Performance Metrics:
------------------------------------>
1. Accuracy
2. Precision
3. Recall
4. F1 Score
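
For a binary classifier, the four metrics above can be computed from the counts of true/false positives and negatives. This is a minimal sketch with made-up labels; the function name `classification_metrics` is an assumption for this example.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```

Precision answers "of the items predicted positive, how many were right", while recall answers "of the actual positives, how many were found"; F1 is their harmonic mean.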

Choose Model Based on Metrics

_images/Choose_model_basedon_metrics.png

Retrieval Augmented Generation (RAG)

_images/Retrieval_Augmented_Generation.png

How RAG Works

_images/how_RAG_works.png

RAG Shell Script

Retrieval Augmented Generation Script
Amazon Bedrock uses RAG to enhance foundation model performance in customer applications.

Amazon Bedrock and Amazon Kendra use vector databases to enhance foundation model performance in semantic search and document retrieval.


Selecting the Pre-trained Models:

RAG (Retrieval Augmented Generation) is a method that combines two components:
large language model generation and information retrieval.


It takes a large language model and pairs it with a retrieval step that fetches relevant information.

The goal of RAG is to retrieve that information and supply it to the model as context for its response.
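
The retrieve-then-generate flow described above can be sketched as follows. This is a toy illustration only: retrieval here is plain word overlap instead of embedding search, the documents are made up, and the final call to a language model is left out. The names `retrieve` and `build_rag_prompt` are assumptions for this example.

```python
import re


def words(text):
    """Lowercase a string and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Rank documents by how many words they share with the query."""
    query_words = words(query)
    ranked = sorted(documents,
                    key=lambda doc: len(query_words & words(doc)),
                    reverse=True)
    return ranked[:top_k]


def build_rag_prompt(query, documents, top_k=1):
    """Place the retrieved text into the prompt as context."""
    context = "\n".join(retrieve(query, documents, top_k))
    return (f"Use the context to answer.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {query}\nAnswer:")


docs = [
    "Amazon Bedrock hosts foundation models.",
    "pgvector adds vector search to PostgreSQL.",
    "PartyRock is a playground for GenAI apps.",
]
query = "What does pgvector add to PostgreSQL?"
prompt = build_rag_prompt(query, docs)
print(prompt)
```

The resulting prompt would then be sent to the language model, which answers using the retrieved context rather than only what it learned during training.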


Vector Databases: the Backbone of RAG
-------------------------------------------------->
metadata about images
metadata about audio
metadata about videos

Amazon Bedrock: RAG in Action

Amazon Bedrock leverages RAG to enhance language models: it retrieves data from knowledge bases to improve responses.

Vector Databases

Vector Databases Overview
Vector databases store data as embeddings, which are numerical representations of data such as text and images.

These embeddings allow fast, efficient, and semantically relevant searches for AI and machine learning tasks.
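
Semantic relevance between two embeddings is often measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions, and the values here are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (made-up values, not produced by a real model).
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

cat_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
print(cat_kitten, cat_car)
```

A score close to 1 means the vectors point in nearly the same direction, which is how related items end up ranked above unrelated ones.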

Several AWS services help store and manage embeddings in vector databases.


Amazon OpenSearch Service for generative AI.
k-Nearest Neighbors (k-NN) for efficient queries.
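
To show what a k-NN query returns, here is a brute-force sketch that scans every stored vector and keeps the k closest by Euclidean distance. It is not how Amazon OpenSearch Service implements k-NN; large vector stores generally rely on index structures to avoid a full scan. The name `knn` and the sample vectors are assumptions for this example.

```python
import math


def knn(query, vectors, k=2):
    """Return the names of the k stored vectors closest to the query."""
    ranked = sorted(vectors.items(),
                    key=lambda item: math.dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


stored = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [0.5, 0.0],
}

print(knn([0.0, 0.1], stored, k=2))
```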


Amazon Aurora PostgreSQL-Compatible Edition and Amazon RDS for PostgreSQL support pgvector:
1. The pgvector extension is available on Amazon Aurora and Amazon RDS for PostgreSQL.
2. It enables storage and similarity searches using ML-generated embeddings.
3. Embeddings capture semantic meaning from text processed by large language models (LLMs).

Amazon Neptune ML:
-------------------------->
Uses Graph Neural Networks (GNNs) to enhance predictions using complex graph relationships.

Vector search for Amazon MemoryDB

Vector search for Amazon DocumentDB (with MongoDB compatibility)

RAG with Amazon Bedrock and custom knowledge bases:

RAG combines retrieved data with generative models.
Amazon Bedrock supports RAG by integrating with custom knowledge bases.

Foundation Model Customizations

_images/foundations_model_customizations.png

Amazon Bedrock Agents

_images/Amazon_Bedrock_Agents.png

Foundational Models

Foundational Models Script
Retrieval Augmented Generation -- it combines retrieved data with generative models.

Agents for multi-step tasks:

Amazon Bedrock Agents can help run multi-step and complex workflows.

Multimodal agents

Prompt Engineering

Prompt Engineering Practices
Prompt engineering supplies instructions and context to the model and tells it what to do with the user's input.

Prompt Templates
Negative Prompts
Context in prompts
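
The three practices above can be combined in one reusable template. This is only a sketch: the field names (`system`, `context`, `negative`, `user_input`) and the layout are assumptions for illustration, and different models expect different prompt formats.

```python
from string import Template

# A reusable template with slots for instructions, context,
# things to avoid (negative prompt), and the user's input.
PROMPT_TEMPLATE = Template(
    "System: $system\n"
    "Context: $context\n"
    "Avoid: $negative\n"
    "User: $user_input"
)


def render_prompt(system, context, negative, user_input):
    """Fill every slot; substitute() raises KeyError if one is missing."""
    return PROMPT_TEMPLATE.substitute(
        system=system,
        context=context,
        negative=negative,
        user_input=user_input,
    )


prompt = render_prompt(
    system="You are a concise AWS assistant.",
    context="The user is studying generative AI on AWS.",
    negative="marketing language, speculation",
    user_input="What is Amazon Bedrock?",
)
print(prompt)
```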


Pre-Training and Fine-Tuning Foundation Models

PEFT - Parameter-Efficient Fine-Tuning
LoRA - Low-Rank Adaptation
ReFT - Representation Fine-Tuning
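
To make the "parameter-efficient" idea concrete, the sketch below illustrates the core of LoRA: the original weight matrix W (d x k) stays frozen and only two small matrices B (d x r) and A (r x k) are trained, giving an updated weight W + B·A. The dimensions are arbitrary, the helper names are assumptions, and the `scale` argument is a simplification of the scaling factor used in practice.

```python
def lora_param_counts(d, k, r):
    """Compare trainable parameters: full fine-tuning vs. LoRA of rank r."""
    full = d * k
    lora = r * (d + k)  # B is d x r, A is r x k
    return full, lora


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_update(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) without modifying W."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


full, lora = lora_param_counts(d=4096, k=4096, r=8)
print(full, lora, lora / full)

w = [[1, 0], [0, 1]]   # frozen weights
b = [[1], [2]]         # trainable, 2 x 1
a = [[3, 4]]           # trainable, 1 x 2
print(lora_update(w, b, a))
```

With these example sizes, LoRA trains well under one percent of the parameters that full fine-tuning would update.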

Multitask Fine-Tuning

Model Performance Evaluation

Foundational Model Performance Script
Jupyter Notebook


BIG-bench

Holistic Evaluation of Language Models (HELM)

Amazon SageMaker Clarify

Amazon Bedrock and BERTScore

In-context learning

Fine-tuning

Responsible AI Practices

Responsible AI Considerations
Healthcare
Financial
Law Firms

Governance
Security
Robustness
Explainability
Fairness in AI

Tools for assessing responsible AI:
----------------------------------------------->
1. Amazon SageMaker Clarify -- bias detection and explaining model decisions
2. Amazon Bedrock Guardrails
3. Environmental impact assessment of AI


Data privacy and security risks:


Balanced datasets

Amazon SageMaker Clarify

SageMaker Data Wrangler

Data Preprocessing:
------------------------------>
1. Data Cleaning
2. Normalization
3. Feature Selection
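
The first two preprocessing steps can be illustrated on a single numeric column. This is a minimal sketch, assuming that cleaning means dropping missing values and that normalization means min-max scaling to the range 0 to 1; both choices, and the function names, are assumptions for this example.

```python
def clean(values):
    """Data cleaning: drop missing entries (represented here as None)."""
    return [v for v in values if v is not None]


def min_max_normalize(values):
    """Normalization: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [10, None, 20, 30]
normalized = min_max_normalize(clean(raw))
print(normalized)
```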

Regular auditing and fairness

Transparency and explainable AI models
----------------------------------------->

Human-Centered AI

A2I - Amazon Augmented AI

Reinforcement Learning from Human Feedback (RLHF)

Human-Centered Design (HCD)

Security, Compliance & Governance

Security and Governance in GenAI
Security

SageMaker notebook instance in a private subnet

SageMaker Distributed Training - inter-node encryption

Amazon SageMaker Security


Data Source --> Data Processing --> Data Storage

SageMaker Model Registry - model versioning
SageMaker Model Cards - documenting model details

SageMaker Feature Store

AWS Artifact: Simplifying Compliance Reporting
AWS Glue DataBrew: Data Preparation for Governance
AWS Lake Formation
Amazon S3
Amazon SageMaker Clarify
AWS Config - Continuous Monitoring for Compliance
Amazon Inspector - Security and Compliance Assessment
AWS Audit Manager - Streamlined Compliance Auditing
AWS CloudTrail - logs all API calls
AWS Trusted Advisor - Best practices and compliance recommendations


Data Governance Strategies:

Data lifecycle management
- S3 lifecycle management

Data logging -- AWS CloudTrail, Amazon CloudWatch

Data curation and understanding -- AWS Glue DataBrew

Master data management using Amazon Redshift and AWS Glue