An ML engineer needs to deploy ML models to get inferences from large datasets in an asynchronous manner. The ML engineer also needs to implement scheduled monitoring of data quality for the models and must receive alerts when changes in data quality occur.
Which solution will meet these requirements?
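For context, a minimal sketch (in Python with boto3) of a SageMaker Asynchronous Inference endpoint configuration, which queues requests and writes results to S3; all names, ARNs, and the model are placeholders, not values from the question. Scheduled data-quality monitoring with alerts would typically be layered on afterward with SageMaker Model Monitor.

```python
import boto3

sm = boto3.client("sagemaker")

# Sketch only: bucket, topics, and the model name are hypothetical.
sm.create_endpoint_config(
    EndpointConfigName="async-demo-config",
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-model",          # assumed to already exist
            "InstanceType": "ml.m5.xlarge",
            "InitialInstanceCount": 1,
        }
    ],
    AsyncInferenceConfig={
        "OutputConfig": {
            "S3OutputPath": "s3://example-bucket/async-output/",
            "NotificationConfig": {
                "SuccessTopic": "arn:aws:sns:us-east-1:111122223333:async-success",
                "ErrorTopic": "arn:aws:sns:us-east-1:111122223333:async-errors",
            },
        },
        "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 4},
    },
)
```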
Case Study
A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.
The company is experimenting with consecutive training jobs.
How can the company MINIMIZE infrastructure startup times for these jobs?
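One mechanism often discussed for consecutive jobs is SageMaker managed warm pools, which keep training instances provisioned between jobs. A minimal sketch with the SageMaker Python SDK; the image URI, role, and bucket are placeholders.

```python
from sagemaker.estimator import Estimator

# Sketch only: image, role, and S3 paths are hypothetical.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/output/",
    # Keep the provisioned instances warm between consecutive jobs (managed warm pools),
    # so the next job skips infrastructure startup.
    keep_alive_period_in_seconds=1800,
)

estimator.fit({"train": "s3://example-bucket/train/"})
```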
A company has historical data that shows whether customers needed long-term support from company staff. The company needs to develop an ML model to predict whether new customers will require long-term support.
Which modeling approach should the company use to meet this requirement?
A company has deployed an ML model that detects fraudulent credit card transactions in real time in a banking application. The model uses Amazon SageMaker Asynchronous Inference. Consumers are reporting delays in receiving the inference results.
An ML engineer needs to implement a solution to improve the inference performance. The solution also must provide a notification when a deviation in model quality occurs.
Which solution will meet these requirements?
A company wants to predict the success of advertising campaigns by considering the color scheme of each advertisement. An ML engineer is preparing data for a neural network model. The dataset includes color information as categorical data.
Which technique for feature engineering should the ML engineer use for the model?
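For context, one-hot encoding turns a nominal category such as color into binary columns so a neural network does not infer a false ordering. A small illustration with pandas on hypothetical ad data.

```python
import pandas as pd

# Hypothetical ad data: color is nominal, so one-hot encoding avoids implying an order.
ads = pd.DataFrame({
    "color_scheme": ["red", "blue", "green", "blue"],
    "impressions": [1200, 800, 950, 1100],
})

# One binary column per color; the result can feed a neural network directly.
encoded = pd.get_dummies(ads, columns=["color_scheme"], dtype=int)
print(encoded)
```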
A company wants to host an ML model on Amazon SageMaker. An ML engineer is configuring a continuous integration and continuous delivery (CI/CD) pipeline in AWS CodePipeline to deploy the model. The pipeline must run automatically when new training data for the model is uploaded to an Amazon S3 bucket.
Select and order the pipeline's correct steps from the following list. Each step should be selected one time or not at all. (Select and order three.)
• An S3 event notification invokes the pipeline when new data is uploaded.
• An S3 Lifecycle rule invokes the pipeline when new data is uploaded.
• SageMaker retrains the model by using the data in the S3 bucket.
• The pipeline deploys the model to a SageMaker endpoint.
• The pipeline deploys the model to SageMaker Model Registry.
A digital media entertainment company needs real-time video content moderation to ensure compliance during live streaming events.
Which solution will meet these requirements with the LEAST operational overhead?
A company is using an Amazon S3 bucket to collect data that will be used for ML workflows. The company needs to use AWS Glue DataBrew to clean and normalize the data.
Which solution will meet these requirements?
An ML engineer normalized training data by using min-max normalization in AWS Glue DataBrew. The ML engineer must normalize the production inference data in the same way as the training data before passing the production inference data to the model for predictions.
Which solution will meet this requirement?
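The underlying requirement is that production data must be scaled with the same minimum and maximum learned from the training data. The same idea, shown with scikit-learn rather than DataBrew for brevity: fit the scaler once on training data, persist it, and reuse it at inference time. The feature values are hypothetical.

```python
import joblib
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Fit the scaler on the training data only, then persist it.
train = np.array([[10.0], [20.0], [30.0], [40.0]])   # hypothetical feature
scaler = MinMaxScaler().fit(train)
joblib.dump(scaler, "minmax_scaler.joblib")

# At inference time, load the same fitted scaler so production records are
# normalized with the training min/max, matching the training-time transform.
scaler = joblib.load("minmax_scaler.joblib")
production = np.array([[25.0], [35.0]])
print(scaler.transform(production))
```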
A company is running ML models on premises by using custom Python scripts and proprietary datasets. The company is using PyTorch. The model building requires unique domain knowledge. The company needs to move the models to AWS.
Which solution will meet these requirements with the LEAST effort?
A company launches a feature that predicts home prices. An ML engineer trained a regression model using the SageMaker AI XGBoost algorithm. The model performs well on training data but underperforms on real-world validation data.
Which solution will improve the validation score with the LEAST implementation effort?
An ML engineer is training an XGBoost regression model in Amazon SageMaker AI. The ML engineer conducts several rounds of hyperparameter tuning with random grid search. After these rounds of tuning, the error rate on the test hold-out dataset is much larger than the error rate on the training dataset.
The ML engineer needs to make changes before running the hyperparameter grid search again.
Which changes will improve the model's performance? (Select TWO.)
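A large gap between training error and hold-out error points to overfitting, which is usually addressed with stronger regularization and early stopping on a validation set. A hedged illustration of XGBoost hyperparameters that push in that direction; the specific values are arbitrary.

```python
# Hypothetical hyperparameters that typically reduce XGBoost overfitting:
# shallower trees, a smaller learning rate, row/column subsampling,
# explicit regularization, and early stopping against a validation set.
hyperparameters = {
    "max_depth": 4,            # lower depth -> less complex trees
    "eta": 0.1,                # smaller learning rate
    "subsample": 0.8,          # sample rows per boosting round
    "colsample_bytree": 0.8,   # sample features per tree
    "alpha": 1.0,              # L1 regularization
    "lambda": 2.0,             # L2 regularization
    "num_round": 500,
    "early_stopping_rounds": 20,
}
# estimator.set_hyperparameters(**hyperparameters)   # estimator assumed to exist
```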
A company wants to use large language models (LLMs) supported by Amazon Bedrock to develop a chat interface for internal technical documentation.
The documentation consists of dozens of text files totaling several megabytes and is updated frequently.
Which solution will meet these requirements MOST cost-effectively?
A company uses a hybrid cloud environment. A model that is deployed on premises uses data in Amazon S3 to provide customers with a live conversational engine.
The model is using sensitive data. An ML engineer needs to implement a solution to identify and remove the sensitive data.
Which solution will meet these requirements with the LEAST operational overhead?
An ML engineer needs to use AWS services to identify and extract meaningful unique keywords from documents.
Which solution will meet these requirements with the LEAST operational overhead?
A company has an existing Amazon SageMaker AI model (v1) on a production endpoint. The company develops a new model version (v2) and needs to test v2 in production before substituting v2 for v1.
The company needs to minimize the risk of v2 generating incorrect output in production and must prevent any disruption of production traffic during the change.
Which solution will meet these requirements?
A company needs to analyze a large dataset that is stored in Amazon S3 in Apache Parquet format. The company wants to use one-hot encoding for some of the columns.
The company needs a no-code solution to transform the data. The solution must store the transformed data back to the same S3 bucket for model training.
Which solution will meet these requirements?
A company has built more than 50 models and deployed the models on Amazon SageMaker AI as real-time inference endpoints. The company needs to reduce the costs of the SageMaker AI inference endpoints. The company used the same ML framework to build the models. The company's customers require low-latency access to the models.
Select and order the correct steps from the following list to reduce the cost of inference and keep latency low. Select each step one time or not at all. (Select and order FIVE.)
• Create an endpoint configuration that references a multi-model container.
• Create a SageMaker AI model with multi-model endpoints enabled.
• Deploy a real-time inference endpoint by using the endpoint configuration.
• Deploy a serverless inference endpoint configuration by using the endpoint configuration.
• Spread the existing models to multiple different Amazon S3 bucket paths.
• Upload the existing models to the same Amazon S3 bucket path.
• Update the models to use the new endpoint ID. Pass the model IDs to the new endpoint.
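For context on the multi-model pattern referenced in the list above, a hedged boto3 sketch: all artifacts sit under one S3 prefix, one container serves them all, and callers select an artifact per request. The image URI, role, bucket, and names are placeholders.

```python
import boto3

sm = boto3.client("sagemaker")

# Sketch only: the artifacts are assumed to share one framework container
# and to be uploaded under a single S3 prefix.
sm.create_model(
    ModelName="shared-mme-model",
    ExecutionRoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    PrimaryContainer={
        "Image": "<framework-inference-image-uri>",
        "Mode": "MultiModel",
        "ModelDataUrl": "s3://example-bucket/models/",   # common prefix for all artifacts
    },
)

sm.create_endpoint_config(
    EndpointConfigName="mme-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "shared-mme-model",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
)

sm.create_endpoint(EndpointName="mme-endpoint", EndpointConfigName="mme-config")

# Callers pick a specific artifact per request with TargetModel.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="mme-endpoint",
    TargetModel="model-17.tar.gz",
    ContentType="text/csv",
    Body=b"1.0,2.0,3.0",
)
```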
A company is creating an ML model to identify defects in a product. The company has gathered a dataset and has stored the dataset in TIFF format in Amazon S3. The dataset contains 200 images in which the most common defects are visible. The dataset also contains 1,800 images in which there is no defect visible.
An ML engineer trains the model and notices poor performance in some classes. The ML engineer identifies a class imbalance problem in the dataset.
What should the ML engineer do to solve this problem?
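One common remedy for a 200-versus-1,800 imbalance is to oversample (and usually augment) the minority class before training. A small sketch with scikit-learn on hypothetical lists of image keys.

```python
from sklearn.utils import resample

# Hypothetical S3 keys for the two classes (200 defect vs. 1,800 no-defect images).
defect_keys = [f"defect/{i}.tiff" for i in range(200)]
normal_keys = [f"normal/{i}.tiff" for i in range(1800)]

# Random oversampling: repeat minority-class samples until the classes are balanced.
# In practice this is usually combined with image augmentation (flips, rotations, crops).
defect_upsampled = resample(
    defect_keys, replace=True, n_samples=len(normal_keys), random_state=42
)
balanced_keys = normal_keys + defect_upsampled
print(len(defect_upsampled), len(balanced_keys))
```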
An ML engineer needs to deploy ML models to get inferences from large datasets in an asynchronous manner. The ML engineer also needs to implement scheduled monitoring of the data quality of the models. The ML engineer must receive alerts when changes in data quality occur.
Which solution will meet these requirements?
A company wants to use Amazon SageMaker AI to host an ML model that runs on CPU for real-time predictions. The model has intermittent traffic during business hours and periods of no traffic after business hours.
Which hosting option will serve inference requests in the MOST cost-effective manner?
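SageMaker Serverless Inference is one option often considered for intermittent CPU traffic, because serverless variants are billed per request and scale to zero between business hours. A hedged boto3 sketch; the model name is a placeholder.

```python
import boto3

sm = boto3.client("sagemaker")

# Sketch only: the model is assumed to already exist.
sm.create_endpoint_config(
    EndpointConfigName="serverless-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-cpu-model",
        "ServerlessConfig": {
            "MemorySizeInMB": 2048,
            "MaxConcurrency": 10,
        },
    }],
)

sm.create_endpoint(
    EndpointName="serverless-endpoint",
    EndpointConfigName="serverless-config",
)
```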
An ML engineer is using AWS CodeDeploy to deploy new container versions for inference on Amazon ECS.
The deployment must shift 10% of traffic initially, and the remaining 90% must shift within 10–15 minutes.
Which deployment configuration meets these requirements?
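The traffic pattern described (10% first, the remaining 90% after an interval) is a time-based canary. A hedged sketch of a custom CodeDeploy configuration for ECS; a predefined configuration such as CodeDeployDefault.ECSCanary10Percent15Minutes expresses the same behavior.

```python
import boto3

codedeploy = boto3.client("codedeploy")

# Sketch only: the configuration name is arbitrary.
codedeploy.create_deployment_config(
    deploymentConfigName="Canary10Percent15Minutes-Custom",
    computePlatform="ECS",
    trafficRoutingConfig={
        "type": "TimeBasedCanary",
        "timeBasedCanary": {
            "canaryPercentage": 10,   # initial traffic shift
            "canaryInterval": 15,     # minutes before the remaining 90% shifts
        },
    },
)
```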
An ML engineer is training an ML model to identify medical patients for disease screening. The tabular dataset for training contains 50,000 patient records: 1,000 with the disease and 49,000 without the disease.
The ML engineer splits the dataset into a training dataset, a validation dataset, and a test dataset.
What should the ML engineer do to transform the data and make the data suitable for training?
A company is using an AWS Lambda function to monitor the metrics from an ML model. An ML engineer needs to implement a solution to send an email message when the metrics breach a threshold.
Which solution will meet this requirement?
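A common pattern is to have the Lambda function publish the metric to CloudWatch and let a CloudWatch alarm notify an email-subscribed SNS topic when the threshold is breached. A hedged sketch; the metric namespace, threshold, and addresses are placeholders.

```python
import boto3

sns = boto3.client("sns")
cloudwatch = boto3.client("cloudwatch")

# Sketch only: the Lambda function is assumed to publish ModelErrorRate to CloudWatch.
topic_arn = sns.create_topic(Name="model-metric-alerts")["TopicArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ml-team@example.com")

cloudwatch.put_metric_alarm(
    AlarmName="model-metric-threshold-breach",
    Namespace="Custom/MLModel",
    MetricName="ModelErrorRate",
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=0.2,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[topic_arn],   # email goes out when the alarm fires
)
```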
An ML model is deployed in production. The model has performed well and has met its metric thresholds for months.
An ML engineer who is monitoring the model observes a sudden degradation. The performance metrics of the model are now below the thresholds.
What could be the cause of the performance degradation?
An ML engineer has an Amazon Comprehend custom model in Account A in the us-east-1 Region. The ML engineer needs to copy the model to Account B in the same Region.
Which solution will meet this requirement with the LEAST development effort?
A company runs an Amazon SageMaker domain in a public subnet of a newly created VPC. The network is configured properly, and ML engineers can access the SageMaker domain.
Recently, the company discovered suspicious traffic to the domain from a specific IP address. The company needs to block traffic from the specific IP address.
Which update to the network configuration will meet this requirement?
A company uses Amazon SageMaker Studio to develop an ML model. The company has a single SageMaker Studio domain. An ML engineer needs to implement a solution that provides an automated alert when SageMaker compute costs reach a specific threshold.
Which solution will meet these requirements?
A company is developing an ML model by using Amazon SageMaker AI. The company must monitor bias in the model and display the results on a dashboard. An ML engineer creates a bias monitoring job.
How should the ML engineer capture bias metrics to display on the dashboard?
A company regularly receives new training data from a vendor of an ML model. The vendor delivers cleaned and prepared data to the company’s Amazon S3 bucket every 3–4 days.
The company has an Amazon SageMaker AI pipeline to retrain the model. An ML engineer needs to run the pipeline automatically when new data is uploaded to the S3 bucket.
Which solution will meet these requirements with the LEAST operational effort?
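One low-effort pattern is to trigger the pipeline from the S3 upload event, for example through an S3 event notification (or EventBridge rule) that invokes a small Lambda function. A hedged handler sketch; the pipeline name and parameter are placeholders.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Sketch of a Lambda handler wired to an S3 event notification on the vendor's bucket.
def lambda_handler(event, context):
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        sagemaker.start_pipeline_execution(
            PipelineName="retraining-pipeline",
            PipelineExecutionDisplayName="triggered-by-new-data",
            PipelineParameters=[
                # Assumes the pipeline defines an "InputDataKey" parameter.
                {"Name": "InputDataKey", "Value": key},
            ],
        )
    return {"statusCode": 200}
```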
A company uses Amazon SageMaker Studio to develop an ML model. The company has a single SageMaker Studio domain. An ML engineer needs to implement a solution that provides an automated alert when SageMaker AI compute costs reach a specific threshold.
Which solution will meet these requirements?
A company's ML engineer has deployed an ML model for sentiment analysis to an Amazon SageMaker AI endpoint. The ML engineer needs to explain to company stakeholders how the model makes predictions.
Which solution will provide an explanation for the model's predictions?
An ML engineer needs to use Amazon SageMaker Feature Store to create and manage features to train a model.
Select and order the steps from the following list to create and use the features in Feature Store. Each step should be selected one time. (Select and order three.)
• Access the store to build datasets for training.
• Create a feature group.
• Ingest the records.
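For context on the three steps listed above, a hedged SageMaker Python SDK sketch: create the feature group, ingest records, then query the offline store to build a training dataset. The role, bucket, and feature values are placeholders.

```python
import time
import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"   # placeholder

# Hypothetical records; every record needs an identifier and an event time.
df = pd.DataFrame({
    "customer_id": ["c1", "c2"],
    "avg_order_value": [42.5, 17.0],
    "event_time": [time.time()] * 2,
})
df["customer_id"] = df["customer_id"].astype("string")

# 1) Create a feature group.
fg = FeatureGroup(name="customer-features", sagemaker_session=session)
fg.load_feature_definitions(data_frame=df)
fg.create(
    s3_uri="s3://example-bucket/feature-store/",
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn=role,
    enable_online_store=True,
)
while fg.describe()["FeatureGroupStatus"] == "Creating":
    time.sleep(5)

# 2) Ingest the records.
fg.ingest(data_frame=df, max_workers=1, wait=True)

# 3) Access the offline store (via Athena) to build a training dataset.
query = fg.athena_query()
query.run(
    query_string=f'SELECT * FROM "{query.table_name}"',
    output_location="s3://example-bucket/feature-store/query-results/",
)
query.wait()
training_df = query.as_dataframe()
```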
An ML engineer is setting up a CI/CD pipeline for an ML workflow in Amazon SageMaker AI. The pipeline must automatically retrain, test, and deploy a model whenever new data is uploaded to an Amazon S3 bucket. New data files are approximately 10 GB in size. The ML engineer also needs to track model versions for auditing.
Which solution will meet these requirements?
A credit card company has a fraud detection model in production on an Amazon SageMaker endpoint. The company develops a new version of the model. The company needs to assess the new model's performance by using live data and without affecting production end users.
Which solution will meet these requirements?
A financial company receives a high volume of real-time market data streams from an external provider. The streams consist of thousands of JSON records per second.
The company needs a scalable AWS solution to identify anomalous data points with the LEAST operational overhead.
Which solution will meet these requirements?
A company needs to deploy a custom-trained classification ML model on AWS. The model must make near real-time predictions with low latency and must handle variable request volumes.
Which solution will meet these requirements?
A company is building an Amazon SageMaker AI pipeline for an ML model. The pipeline uses distributed processing and training.
An ML engineer needs to encrypt network communication between instances that run distributed jobs. The ML engineer configures the distributed jobs to run in a private VPC.
What should the ML engineer do to meet the encryption requirement?
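For context, SageMaker training and processing jobs expose an inter-container traffic encryption option for exactly this case. A hedged sketch with the SageMaker Python SDK; image, role, subnets, and security groups are placeholders.

```python
from sagemaker.estimator import Estimator

# Sketch only: network and IAM identifiers are hypothetical.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=4,                      # distributed job
    instance_type="ml.m5.2xlarge",
    subnets=["subnet-0abc1234"],
    security_group_ids=["sg-0abc1234"],
    # Encrypts traffic between the instances that run the distributed job.
    encrypt_inter_container_traffic=True,
)
```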
An ML engineer wants to run a training job on Amazon SageMaker AI by using multiple GPUs. The training dataset is stored in Apache Parquet format.
The Parquet files are too large to fit into the memory of the SageMaker AI training instances.
Which solution will fix the memory problem?
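One commonly cited mitigation is to stream the dataset from S3 instead of materializing it all at once, for example with FastFile input mode, and to read the Parquet data in batches inside the training script. A hedged sketch; the S3 prefix is a placeholder.

```python
from sagemaker.inputs import TrainingInput

# Sketch only: FastFile mode streams objects from S3 on demand instead of
# downloading the full Parquet dataset before training starts; the training
# code is then expected to read the files in batches.
train_input = TrainingInput(
    s3_data="s3://example-bucket/train-parquet/",
    input_mode="FastFile",
)

# estimator.fit({"train": train_input})   # the estimator itself is assumed to exist
```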
An ML engineer is training a simple neural network model. The ML engineer tracks the performance of the model over time on a validation dataset. The model's performance improves substantially at first and then degrades after a specific number of epochs.
Which solutions will mitigate this problem? (Choose two.)
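Validation performance that improves and then degrades with more epochs is the classic overfitting curve; early stopping and dropout are the usual countermeasures. A hedged Keras sketch with a generic dense network; the training tensors are assumed to exist.

```python
import tensorflow as tf

# Sketch only: a generic regression network for illustration.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),          # dropout regularizes the network
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,                  # stop after 5 epochs without improvement
    restore_best_weights=True,   # roll back to the best epoch
)

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=200, callbacks=[early_stop])
```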
A company wants to improve the sustainability of its ML operations.
Which actions will reduce the energy usage and computational resources that are associated with the company's training jobs? (Choose two.)
An ML engineer needs to use AWS CloudFormation to create an ML model that an Amazon SageMaker endpoint will host.
Which resource should the ML engineer declare in the CloudFormation template to meet this requirement?
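For context, AWS::SageMaker::Model is the CloudFormation resource type that declares a model; a sketch of a minimal template submitted with boto3 (image, artifact path, and role are placeholders). Hosting it would additionally require AWS::SageMaker::EndpointConfig and AWS::SageMaker::Endpoint resources.

```python
import json
import boto3

# Sketch only: the template declares a single AWS::SageMaker::Model resource.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "MyModel": {
            "Type": "AWS::SageMaker::Model",
            "Properties": {
                "ExecutionRoleArn": "arn:aws:iam::111122223333:role/SageMakerExecutionRole",
                "PrimaryContainer": {
                    "Image": "<inference-image-uri>",
                    "ModelDataUrl": "s3://example-bucket/model/model.tar.gz",
                },
            },
        }
    },
}

boto3.client("cloudformation").create_stack(
    StackName="sagemaker-model-stack",
    TemplateBody=json.dumps(template),
)
```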
A company has a custom extract, transform, and load (ETL) process that runs on premises. The ETL process is written in the R language and runs for an average of 6 hours. The company wants to migrate the process to run on AWS.
Which solution will meet these requirements?
A company is exploring generative AI and wants to add a new product feature. An ML engineer is making API calls from existing Amazon EC2 instances to Amazon Bedrock.
The EC2 instances are in a private subnet and must remain private during the implementation. The EC2 instances have a security group that allows access to all IP addresses in the private subnet.
What should the ML engineer do to establish a connection between the EC2 instances and Amazon Bedrock?
A gaming company needs to deploy a natural language processing (NLP) model to moderate a chat forum in a game. The workload experiences heavy usage during evenings and weekends but minimal activity during other hours.
Which solution will meet these requirements MOST cost-effectively?
A company must install a custom script on any newly created Amazon SageMaker AI notebook instances.
Which solution will meet this requirement with the LEAST operational overhead?
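For context, SageMaker notebook instance lifecycle configurations run a script automatically when an instance is created, with no per-instance manual work. A hedged boto3 sketch; the script body and names are placeholders.

```python
import base64
import boto3

sm = boto3.client("sagemaker")

# Sketch only: the shell script content is hypothetical.
script = "#!/bin/bash\nset -e\n# install or copy the custom script here\n"

sm.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName="install-custom-script",
    OnCreate=[{"Content": base64.b64encode(script.encode("utf-8")).decode("utf-8")}],
)

# New notebook instances then reference the configuration:
# sm.create_notebook_instance(..., LifecycleConfigName="install-custom-script")
```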
A government agency is conducting a national census to assess program needs by area and city. The census form collects approximately 500 responses from each citizen. The agency needs to analyze the data to extract meaningful insights. The agency wants to reduce the dimensions of the high-dimensional data to uncover hidden patterns.
Which solution will meet these requirements?
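Dimensionality reduction of roughly 500 correlated responses per citizen is the kind of task principal component analysis (PCA) handles. A small scikit-learn sketch on synthetic data to show the idea.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical census matrix: one row per citizen, ~500 response columns.
rng = np.random.default_rng(0)
responses = rng.normal(size=(10_000, 500))

# Project the 500 answers onto the principal components that retain 95% of the
# variance, making hidden patterns easier to explore in far fewer dimensions.
pca = PCA(n_components=0.95)
reduced = pca.fit_transform(responses)
print(reduced.shape, pca.n_components_)
```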
A company is building an enterprise AI platform. The company must catalog models for production, manage model versions, and associate metadata such as training metrics with models. The company needs to eliminate the burden of managing different versions of models.
Which solution will meet these requirements?
Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
The ML engineer needs to use an Amazon SageMaker built-in algorithm to train the model.
Which algorithm should the ML engineer use to meet this requirement?
An ML engineer is building a generative AI application on Amazon Bedrock by using large language models (LLMs).
Select the correct generative AI term from the following list for each description. Each term should be selected one time or not at all. (Select three.)
• Embedding
• Retrieval Augmented Generation (RAG)
• Temperature
• Token
An ML engineer is preparing a dataset that contains medical records to train an ML model to predict the likelihood of patients developing diseases.
The dataset contains columns for patient ID, age, medical conditions, test results, and a "Disease" target column.
How should the ML engineer configure the data to train the model?
Case Study
A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.
The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.
The company needs to use the central model registry to manage different versions of models in the application.
Which action will meet this requirement with the LEAST operational overhead?
A company wants to deploy an Amazon SageMaker AI model that can queue requests. The model needs to handle payloads of up to 1 GB that take up to 1 hour to process. The model must return an inference for each request. The model also must scale down when no requests are available to process.
Which inference option will meet these requirements?
A company is running ML models on premises by using custom Python scripts and proprietary datasets. The company is using PyTorch. The model building requires unique domain knowledge. The company needs to move the models to AWS.
Which solution will meet these requirements with the LEAST development effort?
An ML engineer decides to use Amazon SageMaker AI automated model tuning (AMT) for hyperparameter optimization (HPO). The ML engineer requires a tuning strategy that uses regression to slowly and sequentially select the next set of hyperparameters based on previous runs. The strategy must work across small hyperparameter ranges.
Which solution will meet these requirements?
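For context, SageMaker AMT's Bayesian strategy fits a regression (surrogate) model of the objective and proposes each next hyperparameter set based on previous runs. A hedged SageMaker SDK sketch; the image, role, ranges, and metric are placeholders.

```python
from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

# Sketch only: estimator settings are hypothetical.
estimator = Estimator(
    image_uri="<xgboost-image-uri>",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:rmse",
    objective_type="Minimize",
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.05, 0.3),
        "max_depth": IntegerParameter(3, 8),
    },
    strategy="Bayesian",
    max_jobs=20,
    max_parallel_jobs=1,   # sequential evaluation, one run informs the next
)

# tuner.fit({"train": train_input, "validation": validation_input})
```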
A healthcare analytics company wants to segment patients into groups that have similar risk factors to develop personalized treatment plans. The company has a dataset that includes patient health records, medication history, and lifestyle changes. The company must identify the appropriate algorithm to determine the number of groups by using hyperparameters.
Which solution will meet these requirements?
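Segmenting patients into similar groups with the group count set as a hyperparameter is an unsupervised clustering task, for example k-means with its k (n_clusters) hyperparameter. A small scikit-learn sketch on synthetic data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical patient feature matrix (already numeric and scaled).
rng = np.random.default_rng(1)
patients = rng.normal(size=(5_000, 20))

# The number of groups is controlled directly through the n_clusters (k) hyperparameter.
kmeans = KMeans(n_clusters=4, random_state=1, n_init=10)
segments = kmeans.fit_predict(patients)
print(np.bincount(segments))
```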
Case study
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.
The training dataset includes categorical data and numerical data. The ML engineer must prepare the training dataset to maximize the accuracy of the model.
Which action will meet this requirement with the LEAST operational overhead?
An ML engineer is using an Amazon SageMaker AI shadow test to evaluate a new model that is hosted on a SageMaker AI endpoint. The shadow test requires significant GPU resources for high performance. The production variant currently runs on a less powerful instance type.
The ML engineer needs to configure the shadow test to use a higher performance instance type for a shadow variant. The solution must not affect the instance type of the production variant.
Which solution will meet these requirements?
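For context, SageMaker endpoint configurations define production and shadow variants independently, so each can use its own instance type. A hedged boto3 sketch; model, endpoint, and instance choices are placeholders.

```python
import boto3

sm = boto3.client("sagemaker")

# Sketch only: the shadow variant gets a larger GPU instance without touching
# the production variant's instance type.
sm.create_endpoint_config(
    EndpointConfigName="shadow-test-config",
    ProductionVariants=[{
        "VariantName": "production",
        "ModelName": "model-v1",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    }],
    ShadowProductionVariants=[{
        "VariantName": "shadow",
        "ModelName": "model-v2",
        "InstanceType": "ml.g5.2xlarge",   # higher-performance GPU instance
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    }],
)

sm.update_endpoint(
    EndpointName="existing-endpoint",
    EndpointConfigName="shadow-test-config",
)
```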
A company uses Amazon SageMaker AI to create ML models. The data scientists need fine-grained control of ML workflows, DAG visualization, experiment history, and model governance for auditing and compliance.
Which solution will meet these requirements?