Ucertify offers a free demo for the AWS-Certified-Machine-Learning-Specialty exam. "AWS Certified Machine Learning - Specialty", also known as the AWS-Certified-Machine-Learning-Specialty exam, is an Amazon certification. This set of posts, Passing the Amazon AWS-Certified-Machine-Learning-Specialty exam, will help you answer those questions. The AWS-Certified-Machine-Learning-Specialty Questions & Answers covers all the knowledge points of the real exam. 100% real Amazon AWS-Certified-Machine-Learning-Specialty exams, revised by experts!

Free AWS-Certified-Machine-Learning-Specialty Demo Online For Amazon Certification:

NEW QUESTION 1
A Machine Learning Specialist is configuring automatic model tuning in Amazon SageMaker.
When using the hyperparameter optimization feature, which of the following guidelines should be followed to improve optimization?

  • A. Choose the maximum number of hyperparameters supported by Amazon SageMaker to search the largest number of combinations possible.
  • B. Specify a very large hyperparameter range to allow Amazon SageMaker to cover every possible value.
  • C. Use log-scaled hyperparameters to allow the hyperparameter space to be searched as quickly as possible.
  • D. Execute only one hyperparameter tuning job at a time and improve tuning through successive rounds of experiments.

Answer: C
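
For reference, log-scaled ranges are set through the scaling_type argument in the SageMaker Python SDK. A minimal sketch, assuming a previously defined estimator plus train_input and validation_input channels, and an XGBoost-style "validation:auc" metric name:

    from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

    # Log scaling explores ranges that span orders of magnitude efficiently.
    hyperparameter_ranges = {
        "learning_rate": ContinuousParameter(1e-5, 1e-1, scaling_type="Logarithmic"),
        "lambda": ContinuousParameter(1e-3, 10.0, scaling_type="Logarithmic"),
    }

    tuner = HyperparameterTuner(
        estimator=estimator,  # assumed: an existing SageMaker estimator
        objective_metric_name="validation:auc",
        hyperparameter_ranges=hyperparameter_ranges,
        max_jobs=20,
        max_parallel_jobs=2,
    )
    tuner.fit({"train": train_input, "validation": validation_input})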

NEW QUESTION 2
A machine learning specialist is running an Amazon SageMaker endpoint using the built-in object detection algorithm on a P3 instance for real-time predictions in a company's production application. When evaluating the model's resource utilization, the specialist notices that the model is using only a fraction of the GPU.
Which architecture changes would ensure that provisioned resources are being utilized effectively?

  • A. Redeploy the model as a batch transform job on an M5 instance.
  • B. Redeploy the model on an M5 instance. Attach Amazon Elastic Inference to the instance.
  • C. Redeploy the model on a P3dn instance.
  • D. Deploy the model onto an Amazon Elastic Container Service (Amazon ECS) cluster using a P3 instance.

Answer: B

Explanation:
https://aws.amazon.com/machine-learning/elastic-inference/
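
A minimal sketch of option B with the SageMaker Python SDK, assuming a previously created model object (instance and accelerator sizes are illustrative):

    # Deploy on a CPU instance and attach an Elastic Inference accelerator,
    # so GPU-level acceleration is provisioned only at the size actually needed.
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
        accelerator_type="ml.eia2.medium",  # Elastic Inference accelerator
    )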

NEW QUESTION 3
A company ingests machine learning (ML) data from web advertising clicks into an Amazon S3 data lake. Click data is added to an Amazon Kinesis data stream by using the Kinesis Producer Library (KPL). The data is loaded into the S3 data lake from the data stream by using an Amazon Kinesis Data Firehose delivery stream. As the data volume increases, an ML specialist notices that the rate of data ingested into Amazon S3 is relatively constant. There also is an increasing backlog of data for Kinesis Data Streams and Kinesis Data Firehose to ingest.
Which next step is MOST likely to improve the data ingestion rate into Amazon S3?

  • A. Increase the number of S3 prefixes for the delivery stream to write to.
  • B. Decrease the retention period for the data stream.
  • C. Increase the number of shards for the data stream.
  • D. Add more consumers using the Kinesis Client Library (KCL).

Answer: C
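
Shard count caps a stream's write throughput (roughly 1 MB/s or 1,000 records/s per shard), so resharding is the lever here. A hedged boto3 sketch; the stream name and counts are placeholders:

    import boto3

    kinesis = boto3.client("kinesis")

    # Double the shard count to raise the stream's ingest throughput ceiling.
    kinesis.update_shard_count(
        StreamName="click-stream",   # placeholder name
        TargetShardCount=8,          # e.g., up from 4 shards
        ScalingType="UNIFORM_SCALING",
    )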

NEW QUESTION 4
A Data Science team is designing a dataset repository where it will store a large amount of training data commonly used in its machine learning models. As Data Scientists may create an arbitrary number of new datasets every day, the solution has to scale automatically and be cost-effective. Also, it must be possible to explore the data using SQL.
Which storage scheme is MOST adapted to this scenario?

  • A. Store datasets as files in Amazon S3.
  • B. Store datasets as files in an Amazon EBS volume attached to an Amazon EC2 instance.
  • C. Store datasets as tables in a multi-node Amazon Redshift cluster.
  • D. Store datasets as global tables in Amazon DynamoDB.

Answer: A
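
Files in S3 can then be explored with SQL through Amazon Athena. A minimal boto3 sketch; the database, table, and output location are placeholders:

    import boto3

    athena = boto3.client("athena")

    # Run a SQL query directly against datasets stored as files in S3.
    response = athena.start_query_execution(
        QueryString="SELECT label, COUNT(*) FROM training_data GROUP BY label",
        QueryExecutionContext={"Database": "ml_datasets"},               # placeholder
        ResultConfiguration={"OutputLocation": "s3://query-results/"},   # placeholder
    )
    print(response["QueryExecutionId"])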

NEW QUESTION 5
A Data Scientist is developing a machine learning model to predict future patient outcomes based on information collected about each patient and their treatment plans. The model should output a continuous value as its prediction. The data available includes labeled outcomes for a set of 4,000 patients. The study was conducted on a group of individuals over the age of 65 who have a particular disease that is known to worsen with age.
Initial models have performed poorly. While reviewing the underlying data, the Data Scientist notices that, out of 4,000 patient observations, there are 450 where the patient age has been input as 0. The other features for these observations appear normal compared to the rest of the sample population.
How should the Data Scientist correct this issue?

  • A. Drop all records from the dataset where age has been set to 0.
  • B. Replace the age field value for records with a value of 0 with the mean or median value from the dataset.
  • C. Drop the age feature from the dataset and train the model using the rest of the features.
  • D. Use k-means clustering to handle missing features.

Answer: A
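
For reference, both candidate fixes are one-liners in pandas. This sketch assumes the records are loaded in a DataFrame named df with an age column:

    import pandas as pd

    # Option A: drop the 450 records where age was entered as 0.
    df_dropped = df[df["age"] != 0]

    # Option B: replace the invalid zeros with the median age of valid records.
    median_age = df.loc[df["age"] != 0, "age"].median()
    df_imputed = df.assign(age=df["age"].replace(0, median_age))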

NEW QUESTION 6
A Machine Learning Specialist is assigned a TensorFlow project using Amazon SageMaker for training, and needs to continue working for an extended period with no Wi-Fi access.
Which approach should the Specialist use to continue working?

  • A. Install Python 3 and boto3 on their laptop and continue the code development using that environment.
  • B. Download the TensorFlow Docker container used in Amazon SageMaker from GitHub to their local environment, and use the Amazon SageMaker Python SDK to test the code.
  • C. Download TensorFlow from tensorflow.org to emulate the TensorFlow kernel in the SageMaker environment.
  • D. Download the SageMaker notebook to their local environment then install Jupyter Notebooks on their laptop and continue the development in a local notebook.

Answer: D

NEW QUESTION 7
A company wants to use automatic speech recognition (ASR) to transcribe messages that are less than 60 seconds long from a voicemail-style application. The company requires the correct identification of 200 unique product names, some of which have unique spellings or pronunciations.
The company has 4,000 words of Amazon SageMaker Ground Truth voicemail transcripts it can use to customize the chosen ASR model. The company needs to ensure that everyone can update their customizations multiple times each hour.
Which approach will maximize transcription accuracy during the development phase?

  • A. Use a voice-driven Amazon Lex bot to perform the ASR customization. Create custom slots within the bot that specifically identify each of the required product names. Use the Amazon Lex synonym mechanism to provide additional variations of each product name as mis-transcriptions are identified in development.
  • B. Use Amazon Transcribe to perform the ASR customization. Analyze the word confidence scores in the transcript, and automatically create or update a custom vocabulary file with any word that has a confidence score below an acceptable threshold value. Use this updated custom vocabulary file in all future transcription tasks.
  • C. Create a custom vocabulary file containing each product name with phonetic pronunciations, and use it with Amazon Transcribe to perform the ASR customization. Analyze the transcripts and manually update the custom vocabulary file to include updated or additional entries for those names that are not being correctly identified.
  • D. Use the audio transcripts to create a training dataset and build an Amazon Transcribe custom language model. Analyze the transcripts and update the training dataset with a manually corrected version of transcripts where product names are not being transcribed correctly. Create an updated custom language model.

Answer: A
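
Several options revolve around Amazon Transcribe custom vocabularies. A hedged boto3 sketch of creating one; the vocabulary name and phrases are illustrative, and phonetic "SoundsLike" hints would require the table-formatted vocabulary file passed via VocabularyFileUri instead of Phrases:

    import boto3

    transcribe = boto3.client("transcribe")

    # Register the product names so the ASR model can recognize them.
    transcribe.create_vocabulary(
        VocabularyName="product-names-v1",                    # placeholder
        LanguageCode="en-US",
        Phrases=["ExampleProductOne", "ExampleProductTwo"],   # illustrative
    )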

NEW QUESTION 8
A data scientist uses an Amazon SageMaker notebook instance to conduct data exploration and analysis. This requires certain Python packages that are not natively available on Amazon SageMaker to be installed on the notebook instance.
How can a machine learning specialist ensure that required packages are automatically available on the notebook instance for the data scientist to use?

  • A. Install AWS Systems Manager Agent on the underlying Amazon EC2 instance and use Systems Manager Automation to execute the package installation commands.
  • B. Create a Jupyter notebook file (.ipynb) with cells containing the package installation commands to execute and place the file under the /etc/init directory of each Amazon SageMaker notebook instance.
  • C. Use the conda package manager from within the Jupyter notebook console to apply the necessary conda packages to the default kernel of the notebook.
  • D. Create an Amazon SageMaker lifecycle configuration with package installation commands and assign the lifecycle configuration to the notebook instance.

Answer: D

Explanation:
https://docs.aws.amazon.com/sagemaker/latest/dg/nbi-add-external.html
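
A hedged sketch of creating such a lifecycle configuration with boto3; the configuration name and pip packages are illustrative:

    import base64
    import boto3

    sm = boto3.client("sagemaker")

    # Shell commands that run every time the notebook instance starts.
    on_start = """#!/bin/bash
    set -e
    sudo -u ec2-user -i <<'EOF'
    source activate python3
    pip install --upgrade scikit-learn imbalanced-learn
    EOF
    """

    sm.create_notebook_instance_lifecycle_config(
        NotebookInstanceLifecycleConfigName="install-extra-packages",  # placeholder
        OnStart=[{"Content": base64.b64encode(on_start.encode()).decode()}],
    )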

NEW QUESTION 9
A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker, with Area Under the ROC Curve (AUC) as the objective metric. This workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?

  • A. A histogram showing whether the most important input feature is Gaussian.
  • B. A scatter plot with points colored by target variable that uses t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input variables in an easier-to-read dimension.
  • C. A scatter plot showing the performance of the objective metric over each training iteration.
  • D. A scatter plot showing the correlation between maximum tree depth and the objective metric.

Answer: D
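
Such a plot can be pulled straight from a completed tuning job. A sketch assuming a tuning job name and that max_depth was one of the tuned hyperparameters:

    import matplotlib.pyplot as plt
    from sagemaker.analytics import HyperparameterTuningJobAnalytics

    # One row per training job: tuned hyperparameter values plus the final AUC.
    df = HyperparameterTuningJobAnalytics("my-tuning-job").dataframe()  # placeholder name

    # Correlating max_depth with the objective shows where to narrow the range.
    plt.scatter(df["max_depth"], df["FinalObjectiveValue"])
    plt.xlabel("max_depth")
    plt.ylabel("AUC")
    plt.show()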

NEW QUESTION 10
A manufacturer of car engines collects data from cars as they are being driven. The data collected includes timestamp, engine temperature, rotations per minute (RPM), and other sensor readings. The company wants to predict when an engine is going to have a problem, so it can notify drivers in advance to get engine maintenance. The engine data is loaded into a data lake for training.
Which is the MOST suitable predictive model that can be deployed into production?

  • A. Add labels over time to indicate which engine faults occur at what time in the future to turn this into a supervised learning problem. Use a recurrent neural network (RNN) to train the model to recognize when an engine might need maintenance for a certain fault.
  • B. This data requires an unsupervised learning algorithm. Use Amazon SageMaker k-means to cluster the data.
  • C. Add labels over time to indicate which engine faults occur at what time in the future to turn this into a supervised learning problem. Use a convolutional neural network (CNN) to train the model to recognize when an engine might need maintenance for a certain fault.
  • D. This data is already formulated as a time series. Use Amazon SageMaker seq2seq to model the time series.

Answer: B
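
A hedged sketch of the marked answer's approach with the SageMaker Python SDK's built-in k-means; the role, bucket, and k are placeholders, and sensor_data is assumed to be a NumPy matrix of readings:

    import numpy as np
    from sagemaker import KMeans

    kmeans = KMeans(
        role=role,                              # assumed: existing execution role
        instance_count=1,
        instance_type="ml.c5.xlarge",
        k=10,                                   # illustrative number of clusters
        output_path="s3://my-bucket/kmeans/",   # placeholder
    )

    # Cluster the sensor readings; atypical clusters may flag failing engines.
    kmeans.fit(kmeans.record_set(sensor_data.astype(np.float32)))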

NEW QUESTION 11
A retail company uses a machine learning (ML) model for daily sales forecasting. The company’s brand manager reports that the model has provided inaccurate results for the past 3 weeks.
At the end of each day, an AWS Glue job consolidates the input data that is used for the forecasting with the actual daily sales data and the predictions of the model. The AWS Glue job stores the data in Amazon S3. The company’s ML team is using an Amazon SageMaker Studio notebook to gain an understanding about the source of the model's inaccuracies.
What should the ML team do on the SageMaker Studio notebook to visualize the model's degradation MOST accurately?

  • A. Create a histogram of the daily sales over the last 3 weeks. In addition, create a histogram of the daily sales from before that period.
  • B. Create a histogram of the model errors over the last 3 weeks. In addition, create a histogram of the model errors from before that period.
  • C. Create a line chart with the weekly mean absolute error (MAE) of the model.
  • D. Create a scatter plot of daily sales versus model error for the last 3 weeks. In addition, create a scatter plot of daily sales versus model error from before that period.

Answer: B
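
A sketch of the marked answer with pandas and Matplotlib, assuming the Glue output is loaded in a DataFrame df with datetime-parsed date plus actual and predicted columns:

    import matplotlib.pyplot as plt
    import pandas as pd

    df["error"] = df["predicted"] - df["actual"]
    cutoff = df["date"].max() - pd.Timedelta(weeks=3)

    # Compare the error distribution before and after the degradation began.
    fig, axes = plt.subplots(1, 2, sharex=True)
    df.loc[df["date"] < cutoff, "error"].hist(ax=axes[0])
    axes[0].set_title("Errors before the last 3 weeks")
    df.loc[df["date"] >= cutoff, "error"].hist(ax=axes[1])
    axes[1].set_title("Errors during the last 3 weeks")
    plt.show()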

NEW QUESTION 12
An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?

  • A. Create one-hot word encoding vectors.
  • B. Produce a set of synonyms for every word using Amazon Mechanical Turk.
  • C. Create word embedding vectors that store the edit distance with every other word.
  • D. Download word embeddings pre-trained on a large corpus.

Answer: D
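
A hedged sketch of the marked answer using the gensim library (an assumption; any embedding source works) and one of its downloadable GloVe models:

    import gensim.downloader as api

    # Load embeddings pre-trained on a large corpus (model name is illustrative).
    vectors = api.load("glove-wiki-gigaword-100")

    # Nearest neighbors in embedding space give words used in similar contexts.
    print(vectors.most_similar("dictionary", topn=5))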

NEW QUESTION 13
A Data Scientist is working on an application that performs sentiment analysis. The validation accuracy is poor, and the Data Scientist thinks that the cause may be a rich vocabulary and a low average frequency of words in the dataset.
Which tool should be used to improve the validation accuracy?

  • A. Amazon Comprehend syntax analysis and entity detection
  • B. Amazon SageMaker BlazingText cbow mode
  • C. Natural Language Toolkit (NLTK) stemming and stop word removal
  • D. Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers

Answer: A
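
For reference, the marked answer's Amazon Comprehend calls look like this in boto3 (the sample text is illustrative):

    import boto3

    comprehend = boto3.client("comprehend")
    text = "The new battery lasts all day and charges quickly."  # illustrative

    # Part-of-speech tags and detected entities for the input text.
    syntax = comprehend.detect_syntax(Text=text, LanguageCode="en")
    entities = comprehend.detect_entities(Text=text, LanguageCode="en")
    print([t["PartOfSpeech"]["Tag"] for t in syntax["SyntaxTokens"]])
    print([e["Text"] for e in entities["Entities"]])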

NEW QUESTION 14
A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data. Which solution requires the LEAST effort to be able to query this data?

  • A. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
  • B. Use AWS Glue to catalogue the data and Amazon Athena to run queries
  • C. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries
  • D. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run queries

Answer: D

NEW QUESTION 15
A Data Scientist received a set of insurance records, each consisting of a record ID, the final outcome among 200 categories, and the date of the final outcome. Some partial information on claim contents is also provided, but only for a few of the 200 categories. For each outcome category, there are hundreds of records distributed over the past 3 years. The Data Scientist wants to predict how many claims to expect in each category from month to month, a few months in advance.
What type of machine learning model should be used?

  • A. Classification month-to-month using supervised learning of the 200 categories based on claim contents.
  • B. Reinforcement learning using claim IDs and timestamps where the agent will identify how many claims in each category to expect from month to month.
  • C. Forecasting using claim IDs and timestamps to identify how many claims in each category to expect from month to month.
  • D. Classification with supervised learning of the categories for which partial information on claim contents is provided, and forecasting using claim IDs and timestamps for all other categories.

Answer: C
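
Before any forecasting model runs, the records must be rolled up into monthly counts per category. A pandas sketch assuming a DataFrame df with claim_id, category, and datetime-parsed outcome_date columns:

    import pandas as pd

    # Monthly claim counts per category: the time series a forecaster consumes.
    monthly = (
        df.assign(month=df["outcome_date"].dt.to_period("M"))
          .groupby(["category", "month"])["claim_id"]
          .count()
          .rename("claims")
          .reset_index()
    )
    print(monthly.head())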

NEW QUESTION 16
A credit card company wants to build a credit scoring model to help predict whether a new credit card applicant will default on a credit card payment. The company has collected data from a large number of sources with thousands of raw attributes. Early experiments to train a classification model revealed that many attributes are highly correlated, that the large number of features slows down the training speed significantly, and that there are some overfitting issues.
The Data Scientist on this project would like to speed up the model training time without losing a lot of information from the original dataset.
Which feature engineering technique should the Data Scientist use to meet the objectives?

  • A. Run self-correlation on all features and remove highly correlated features
  • B. Normalize all numerical values to be between 0 and 1
  • C. Use an autoencoder or principal component analysis (PCA) to replace original features with new features
  • D. Cluster raw data using k-means and use sample data from each cluster to build a new dataset

Answer: B
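
Sketches of the techniques named in options B and C with scikit-learn, assuming the raw features sit in a NumPy matrix X:

    from sklearn.decomposition import PCA
    from sklearn.preprocessing import MinMaxScaler

    # Option B: rescale every numerical feature into [0, 1].
    X_scaled = MinMaxScaler().fit_transform(X)

    # Option C: replace the original features with components that retain
    # 95% of the variance, shrinking dimensionality for faster training.
    X_reduced = PCA(n_components=0.95).fit_transform(X_scaled)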

NEW QUESTION 17
......

100% Valid and Newest Version AWS-Certified-Machine-Learning-Specialty Questions & Answers shared by Surepassexam, Get Full Dumps HERE: https://www.surepassexam.com/AWS-Certified-Machine-Learning-Specialty-exam-dumps.html (New 208 Q&As)