What are some methods to overcome limited throughput between the CPU and GPU? (Choose two.)
When deploying an LLM using NVIDIA Triton Inference Server for a real-time chatbot application, which optimization technique is most effective for reducing latency while maintaining high throughput?
When designing an experiment to compare the performance of two LLMs on a question-answering task, which statistical test is most appropriate to determine if the difference in their accuracy is significant, assuming the data follows a normal distribution?
Which technique is used in prompt engineering to guide LLMs in generating more accurate and contextually appropriate responses?
Which of the following prompt engineering techniques is most effective for improving an LLM's performance on multi-step reasoning tasks?
Which calculation is most commonly used to measure the semantic closeness of two text passages?
You have developed a deep learning model for a recommendation system and want to evaluate it using A/B testing. What is the rationale for using A/B testing to evaluate a deep learning model's performance?
In the field of AI experimentation, what is the GLUE benchmark used to evaluate?
In the context of preparing a multilingual dataset for fine-tuning an LLM, which preprocessing technique is most effective for handling text from diverse scripts (e.g., Latin, Cyrillic, Devanagari) to ensure consistent model performance?
Which metrics would you use to evaluate how accurately the responses generated by a RAG workflow address the input query? (Choose two.)
Which of the following is a key characteristic of Rapid Application Development (RAD)?
When implementing data parallel training, which of the following considerations needs to be taken into account?
In the context of evaluating a fine-tuned LLM for a text classification task, which experimental design technique ensures robust performance estimation when dealing with imbalanced datasets?
Which metric is primarily used to evaluate the quality of the text generated by language models?
What is the main consequence of the scaling law in deep learning for real-world applications?
Which principle of Trustworthy AI primarily concerns the ethical implications of AI's impact on society and includes considerations for both potential misuse and unintended consequences?
Which model deployment framework is used to deploy an NLP project, especially for high-performance inference in production environments?
In the context of a natural language processing (NLP) application, which approach is most effective for implementing zero-shot learning to classify text data into categories that were not seen during training?
When comparing and contrasting the ReLU and sigmoid activation functions, which statement is true?
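
For the ReLU and sigmoid comparison in the last question, candidate statements can be checked against the standard definitions. The sketch below is illustrative only and assumes the usual textbook forms, ReLU(x) = max(0, x) and sigmoid(x) = 1 / (1 + e^(-x)). The helper names (relu, sigmoid, relu_grad, sigmoid_grad) are hypothetical and not part of the question bank.

```python
import math


def relu(x: float) -> float:
    """ReLU: passes positive inputs through unchanged, clamps the rest to 0."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Sigmoid: squashes any real input into the open interval (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu_grad(x: float) -> float:
    """Derivative of ReLU (taken as 0 at x == 0 by convention)."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x: float) -> float:
    """Derivative of sigmoid: s * (1 - s), which peaks at 0.25 when x == 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


if __name__ == "__main__":
    # Print values and gradients side by side for a few sample inputs.
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(
            f"x={x:6.1f}  relu={relu(x):8.4f}  relu'={relu_grad(x):.1f}  "
            f"sigmoid={sigmoid(x):.6f}  sigmoid'={sigmoid_grad(x):.6f}"
        )
```

Running it shows the properties such questions usually probe: ReLU is unbounded above with a constant gradient of 1 for positive inputs, while sigmoid is bounded in (0, 1) and its gradient shrinks toward 0 for inputs of large magnitude.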