AIResearch
Beginning to train our LLMCA-1-1B model
February 10, 2025

On February 10, 2025, we will begin training our LLMCA-1 model. LLMCA (Large Language Model Coozy AI) is a decoder-only autoregressive transformer with 1 billion parameters.
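The post does not disclose the model's hyperparameters, but a standard decoder-only transformer's parameter count can be tallied from its config. As a rough sketch, the hypothetical values below (vocabulary 32k, hidden size 2048, 20 layers, 4x feed-forward expansion — all assumptions, not LLMCA-1's actual settings) land near the stated 1 billion:

```python
# Hypothetical hyperparameters (NOT disclosed in the post), chosen so a
# standard decoder-only transformer totals roughly 1B parameters.
VOCAB = 32_000
D_MODEL = 2048
N_LAYERS = 20
FFN_MULT = 4  # feed-forward inner dimension = 4 * d_model

def decoder_only_param_count(vocab, d, layers, ffn_mult, tied_embeddings=True):
    embed = vocab * d                   # token-embedding matrix
    attn = 4 * d * d                    # Q, K, V, and output projections
    ffn = 2 * d * (ffn_mult * d)        # up- and down-projections
    norms = 2 * d                       # two LayerNorm gains per block (biases omitted)
    per_layer = attn + ffn + norms
    head = 0 if tied_embeddings else vocab * d  # LM head, shared with embeddings if tied
    return embed + layers * per_layer + head

total = decoder_only_param_count(VOCAB, D_MODEL, N_LAYERS, FFN_MULT)
print(f"{total / 1e9:.2f}B parameters")  # → 1.07B parameters
```

Many configurations hit the same budget; deeper-and-narrower or wider-and-shallower trade-offs are a design choice the announcement leaves open.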
The model will be trained on a custom dataset of approximately 1.47 billion tokens, combining publicly available corpora with carefully curated data to ensure coverage across diverse domains.
With this project, we hope to push the boundaries of training capable language models at a small scale.
More updates will be shared as we progress towards the end of the training phase.