Building Transformative AI Solutions With NVIDIA DGX™ Cloud
From foundation model tuning to legacy system modernization
Introduction
As AI adoption continues to mature across industries, having the right infrastructure to support complex, compute-intensive workloads is more important than ever.
NVIDIA DGX Cloud is an advanced AI computing platform designed to meet the performance and scalability requirements of training and inferencing large-scale AI models. Through a select research program, NVIDIA provided SoftServe with access to its DGX Cloud platform. Using Google Cloud H100 clusters, a key component of the DGX Cloud infrastructure, we leveraged their computational power and flexibility to fine-tune and test a range of AI solutions, including multimodal retrieval-augmented generation (RAG) with domain and language adaptation, and code translation.
To demonstrate the impact of NVIDIA DGX Cloud, SoftServe developed and tested a series of targeted Gen AI solutions: from adapting language models for underrepresented languages, to generating synthetic data for more reliable large language models (LLMs), to accelerating the translation of legacy code into modern alternatives. Each project highlights how the right infrastructure supports the demands of modern AI.
Project highlights: real-world Gen AI applications with NVIDIA DGX Cloud
We began by addressing the challenges of building accurate AI solutions for underrepresented languages, with a focus on Arabic.
RAG for Arabic
- AI challenges in Arabic processing: Arabic presents unique challenges for AI, including model biases, limited language coverage in mainstream models, technical barriers, and regulatory constraints, all of which hinder effective use in production applications.
- Benefits of Arabic-specific LLMs: Implementing Arabic-specific language models improves accuracy, reduces costs, enhances user experience, and ensures compliance with local regulations.
- Solution overview: SoftServe's Arabic multimodal RAG application ingests text, tables, and images from Arabic PDFs. It leverages advanced AI pipelines for parsing and retrieval, tailored to the specific linguistic features of Arabic through custom language-adapted models (a minimal retrieval sketch follows this list).
- Outcomes: The system demonstrated improved performance after fine-tuning on specialized Arabic datasets, outperforming existing cloud-based models and showcasing the potential for scalable, cost-effective Arabic language processing.
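The retrieval core of such a pipeline can be sketched as follows. This is a minimal illustration, assuming a multilingual sentence-transformers embedding model and a FAISS index; the model name, chunking strategy, and helper functions are placeholders, not the production configuration described above.

```python
# Minimal sketch of the retrieval core of an Arabic RAG pipeline.
# Model name and chunking are illustrative assumptions.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# A multilingual embedding model that covers Arabic (assumed choice).
embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def build_index(chunks: list[str]) -> faiss.IndexFlatIP:
    """Embed parsed PDF chunks (text, table cells, image captions) and index them."""
    vectors = embedder.encode(chunks, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])  # cosine similarity via inner product
    index.add(np.asarray(vectors, dtype="float32"))
    return index

def retrieve(query: str, index: faiss.IndexFlatIP,
             chunks: list[str], k: int = 5) -> list[str]:
    """Return the k chunks most relevant to an Arabic query."""
    q = embedder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [chunks[i] for i in ids[0]]

chunks = ["...parsed Arabic text, table, and image-caption chunks..."]
index = build_index(chunks)
context = retrieve("ما هي الشروط الرئيسية في العقد؟", index, chunks)
# `context` is then passed to the language-adapted LLM as grounding.
```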

Fine-tuning for smarter Gen AI with synthetic data
Building on the success of our language-focused work, we next explored how synthetic data can overcome some of the biggest hurdles in adapting LLMs to domain-specific use cases.
- Challenges addressed by synthetic data: Common challenges in adapting LLMs to specific domains — such as data scarcity, lack of domain diversity, and the high cost of labeled data — can be effectively addressed using synthetic datasets.
- Benefits of adapting LLMs with synthetic data: Synthetic data offers an efficient and cost-effective alternative to traditional datasets. These datasets help improve model reliability, domain relevance, and generalization by supplementing or replacing limited real-world data.
- Solution overview:
- Project #1: Create QA data. This project used Gen AI models to create question-answer (QA) pairs from real-world data. Model performance was then tested after fine-tuning on the synthetic dataset, using an LLM as an evaluator to compare outputs to ground-truth data for accuracy and reliability.
- Project #2: Train a chatbot with synthetic data. In this project, synthetic data was generated to improve the performance of a Gen AI model. The data included technical specifications for different devices, such as servers, to build a chatbot that understands and performs well on this type of content.
- Outcomes: These two projects showed how synthetic data enables the successful adaptation of LLMs: one by generating QA pairs for training, the other by enhancing chatbot reliability through synthetic technical data. Both make LLMs more accurate and better tuned to their domain (a schematic sketch of the QA workflow follows this list).
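To make the workflow concrete, here is a schematic sketch of the QA-generation and LLM-as-evaluator steps. The `call_llm` helper is a hypothetical stand-in for whichever chat endpoint is used (for example, a deployed NIM service); the prompts and the 1-5 scoring scale are illustrative assumptions.

```python
# Schematic sketch of the synthetic QA workflow: generate QA pairs from
# source passages, then score a tuned model's answers with an LLM judge.
import json

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with the actual client for your endpoint."""
    raise NotImplementedError

QA_PROMPT = (
    "From the passage below, write three question-answer pairs as a JSON list "
    'of objects with "question" and "answer" keys.\n\nPassage:\n{passage}'
)

JUDGE_PROMPT = (
    "Question: {question}\nReference answer: {reference}\n"
    "Model answer: {candidate}\nScore the model answer from 1 to 5 for factual "
    "agreement with the reference. Reply with the number only."
)

def generate_qa_pairs(passage: str) -> list[dict]:
    """Generate synthetic QA pairs from real-world text for fine-tuning."""
    return json.loads(call_llm(QA_PROMPT.format(passage=passage)))

def judge(question: str, reference: str, candidate: str) -> int:
    """LLM-as-evaluator: score a tuned model's answer against ground truth."""
    reply = call_llm(JUDGE_PROMPT.format(
        question=question, reference=reference, candidate=candidate))
    return int(reply.strip())
```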

Code translation tuning
We also turned our attention to a challenge facing many enterprises today: how to modernize legacy systems using AI-driven code translation.
- Legacy system challenges: Organizations need to move from outdated languages like COBOL to modern ones like Python because of scalability and integration issues, a shrinking developer pool, security vulnerabilities, and high maintenance costs. Python's concise syntax, strong community support, and compatibility with cloud technologies make systems easier to maintain, scale, and improve than traditional COBOL.
- Benefits of code translation tuning: Sectors like banking, insurance, and government can benefit greatly from migrating to Python, enhancing efficiency and security while reducing operational costs.
- Solution overview: Various methodologies, including proprietary models and control agents, have been developed to automate COBOL-to-Python translation, each with unique advantages and drawbacks. The custom model fine-tuning approach ensures high accuracy and error correction by translating COBOL logic into Python, validating outputs against the original COBOL results, and refining the code as needed (a sketch of this validate-and-refine loop follows the list).
- Outcomes: The transition from COBOL to Python is vital for organizations to remain competitive, with AI tools significantly easing the migration process while preserving essential business logic.
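Below is a hedged sketch of the translate-validate-refine loop. The `translate` call stands in for the fine-tuned translation model, the prompts are illustrative, and the reference outputs are assumed to come from running the original COBOL program on the same test inputs.

```python
# Hedged sketch of the translate-validate-refine loop for COBOL-to-Python
# migration. All names and prompts are illustrative, not the actual tooling.
import subprocess

def translate(prompt: str) -> str:
    """Hypothetical call to the fine-tuned COBOL-to-Python model."""
    raise NotImplementedError

def run_python(source: str, stdin: str) -> str:
    """Execute a candidate translation and capture its stdout."""
    result = subprocess.run(
        ["python", "-c", source],
        input=stdin, text=True, capture_output=True, timeout=30,
    )
    return result.stdout

def migrate(cobol_source: str, test_inputs: list[str],
            expected_outputs: list[str], max_rounds: int = 3) -> str:
    """Translate, compare against COBOL reference outputs, refine on mismatch."""
    candidate = translate(f"Translate this COBOL program to Python:\n{cobol_source}")
    for _ in range(max_rounds):
        failures = []
        for stdin, expected in zip(test_inputs, expected_outputs):
            actual = run_python(candidate, stdin)
            if actual != expected:
                failures.append((stdin, expected, actual))
        if not failures:
            return candidate  # behavior matches the legacy program
        # Feed failing cases back to the model for another round (assumed prompt).
        candidate = translate(
            f"Fix this Python translation of a COBOL program.\n"
            f"COBOL source:\n{cobol_source}\n"
            f"Current Python:\n{candidate}\n"
            f"Failing (input, expected, actual) cases: {failures}"
        )
    raise RuntimeError("translation did not converge to matching outputs")
```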
Technical details
In our projects, we employed a unified approach to LLM customization, ensuring a streamlined flow across data preparation, fine-tuning, evaluation, and model storage. Our approach adhered to industry best practices, prioritizing scalability and efficiency, enabled by the toolset summarized below; a minimal fine-tuning sketch and a serving example follow the table.
PHASE | STEPS TAKEN | TOOLS USED |
---|---|---|
Data preparation | OCR, text chunking | Amazon Textract, custom scripts |
NIM LLM deployment | Deployment of foundation and fine-tuned models | NIM, vLLM |
Fine-tuning foundation models | Fine-tuned models with the prepared data | NVIDIA GPUs, PyTorch, Hugging Face Transformers, NIM Customizer |
Training job | Ran on 6/4 H100 GPUs | NVIDIA H100 GPUs |
Experimentation, tracking, and evaluation | Compared tuned vs. base models, tracked performance | Weights & Biases, Langfuse, DeepEval |
Model storage | Fine-tuned checkpoints saved and ready for deployment | Amazon S3 bucket |
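As a minimal illustration of the fine-tuning phase, the sketch below uses PyTorch and Hugging Face Transformers, matching the tools in the table. The base model name, dataset file, and hyperparameters are placeholder assumptions, not the values used in these projects.

```python
# Minimal fine-tuning sketch: PyTorch + Hugging Face Transformers on NVIDIA GPUs.
# Model, data path, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # causal LMs often ship without one
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepared data from the earlier phase: OCR'd, chunked records with a "text" field (assumed schema).
dataset = load_dataset("json", data_files="prepared_chunks.jsonl", split="train")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="checkpoints",        # later synced to the S3 bucket (see table)
    per_device_train_batch_size=2,
    num_train_epochs=3,
    bf16=True,                       # H100s handle bfloat16 natively
    report_to="wandb",               # experiment tracking, as in the table
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # mlm=False gives causal-LM labels (copies of input_ids with padding masked)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```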
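And a brief serving example with vLLM, one of the deployment options listed in the table; the checkpoint path and prompt are placeholders.

```python
# Illustrative serving sketch with vLLM for a fine-tuned checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(model="checkpoints/final")  # checkpoint path is a placeholder
params = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(["Summarize the device specification:"], params)
print(outputs[0].outputs[0].text)
```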
Conclusion
Thanks to NVIDIA’s support and its collaboration with SoftServe’s Gen AI Lab, we advanced our expertise and developed several proofs of concept. Across use cases like multimodal RAG, synthetic data generation, and legacy code translation, NVIDIA DGX Cloud demonstrated strategic value in accelerating AI adoption at scale. Common threads included enabling high-performance model execution and supporting data-intensive workflows for complex Gen AI tasks.
Start a conversation with us