When I was a high school student, the creativity of young people was hard for me to comprehend. I had learned that Einstein published special relativity when he was 26. Now I finally understand: to make contributions in one’s own field, one first needs to immerse one’s mind in a specific environment and develop representations of the subject. This process takes time, which is why it is hard to switch to a different field in a short period. Switching fields means shifting one’s representations or, more intuitively, shifting one’s ability to discriminate between concepts.

Representations of data are important. One needs to discriminate between all the concepts in a subject in order to derive relations between different concepts and entities. So a good representation is the foundation of creativity.

To develop a good representation in a specific field, one needs to construct basic, simple concepts first. Building on these basic operations, it becomes possible to form deeper representations of more difficult concepts out of earlier representations. For humans, it is normal to learn basic knowledge first and harder knowledge later. LLMs may have a similar learning frontier, like the diagram in Lilian Weng’s blog post about data quality. Data at the frontier is learned continuously, or at least at the fastest rate.

As Ilya has mentioned, understanding is the most important part of LLMs; generating the next token is not that important.