AI Briefing: Writer’s CTO on how to make AI models think more creatively


Finding ways to make AI models more creative and differentiated is increasingly important, especially when training data is similar across large language models. That reality has more enterprise customers asking for ways to make AI more creative in generating content, and to help with the actual creative thinking process.

Last month, AI startup Writer released a new LLM called Palmyra Creative, which aims to help enterprises squeeze more creativity out of generative AI. The goal is not just more creative outputs; the model is also intended to help companies use AI in a more creative way. Palmyra Creative follows other domain-specific LLMs released by Writer, such as the healthcare-focused Palmyra Med and the finance-focused Palmyra Fin. (Writer customers using its various models include Qualcomm, Vanguard, Salesforce, Kenvue, Uber and Dropbox.)

AI models have generally evolved quite a bit in creative thinking over the past few years. Some researchers have found that LLMs outperform humans in areas such as divergent thinking. Last year, researchers from the University of Arkansas published a paper exploring how OpenAI’s GPT-4 model is able to generate more creative ideas, find different solutions to problems and explore different points of view. However, current LLMs are still largely limited to the knowledge in their training data, rather than the lived experience or learned lessons humans are able to draw upon.

Writer’s approach involves creating AI models that adapt themselves, or “self-evolve,” said Writer CTO Waseem Al Shikh, who co-founded the company with CEO May Habib in 2020. Rather than worrying about the size of the model itself, Al Shikh explained that the company is now focused on developing models with a framework built on three separate buckets: model knowledge, model reasoning and model behavior.

“It’s not enough to just have a creative model,” Al Shikh told Digiday last month. “It’s like a human, isn’t it? Everyone might have the same libraries with lots of books, and everyone will come up with ideas, but the funny thing is we don’t all create ideas around just one theme. So the plan going forward is to have all our models be self-evolving, with creativity at the top of the feature list.”

Writer’s updates also benefit from the company’s partnership with Nvidia through its use of NIM — short for Nvidia Inference Microservices — which helps simplify and accelerate the deployment and scaling of AI models for various business-specific uses. In a way, NIM serves as a traffic controller that helps decide which AI model to use and when, depending on the company, its expertise and the task at hand.

“With workflows, you know the beginning and the steps,” Al Shikh said. “This NIM concept is very futuristic; we can get there, but you will need all these models. This is why we build domain-specific models. You can have three, four or five specific models, and they evolve themselves based on customer behavior.”
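The “traffic controller” role described above can be sketched as a simple keyword-based router that sends each task to the domain model it overlaps most. This is a hypothetical illustration only; the function, model names and keyword sets below are invented for the example and are not Nvidia’s NIM API or Writer’s actual orchestration logic.

```python
def route(task, models, fallback="palmyra-general"):
    """Pick a domain-specific model for a task by keyword overlap.

    `models` maps a tuple of domain keywords to a model name. The task
    goes to the model whose keywords overlap the task text most; with
    no overlap, a general-purpose model handles it.
    (Hypothetical sketch; names are invented for illustration.)
    """
    words = set(task.lower().split())
    best, best_overlap = fallback, 0
    for keywords, name in models.items():
        overlap = len(words & set(keywords))
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best
```

A real router would classify with a model rather than keywords, but the shape of the decision — several specialist models behind one dispatcher — is the same.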

Unlocking new ways to think more creatively could give marketers and others new ways to find ideas, break out of the AI echo chamber and escape the uniform patterns that plague many AI outputs. Writer sees retailers potentially using Palmyra Creative for personalized marketing campaigns or expanded loyalty programs. The model could also help healthcare providers simplify patient communication, equip financial firms to create more educational tools, or give B2B technology companies ideas for product positioning and for refining technical documents.

This conversation has been edited for brevity and clarity.

What makes Palmyra Creative different from other models?

Our larger models — such as the finance or medical models — focus more on what we call knowledge. We want them to be accurate on every single formula and every single drug they reference. When you move to a financial model, it’s about focusing on basic reasoning and mathematical equations. Behavior changes, too. General models try to balance between them [knowledge, reasoning and behavior].

How was the model development process different?

Since all the models have similar architectures and similar training data, it’s just a matter of looking at how similar the weights actually are. What we decided to do was take the same training data we have today, but get more creative with the weights. We trained three separate models, then started merging them and swapping layers between them. What happens then is that you have unique relationships that don’t exist in any other model. We also found that the model has an interesting behavior: it can actually push back and not follow the traditional path of everyone else, because the weights are unique to the model itself. We call this dynamic merging between layers.

Model merging is not a new idea, but the technique itself and how we use it are new. The other thing is that we split the models between the layers, and we have a specific way of making sure the relationships between them aren’t broken, so you don’t end up with gibberish or weird hallucinations. It’s a fine line between what ends up being a hallucination and what creativity looks like.
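The layer-level merging Al Shikh describes can be sketched in miniature. The code below treats each model as a dict of per-layer weight lists and, layer by layer, either swaps in one parent’s weights verbatim or averages across all parents. It is a generic illustration of layer-wise model merging under those stated assumptions, not Writer’s proprietary “dynamic merging between layers”.

```python
import random

def merge_models(models, seed=0):
    """Layer-wise merge of several models with identical architecture.

    Each model is a dict mapping layer name -> weight vector (a list of
    floats). For every layer we either take one parent's weights
    wholesale ("swap") or average across all parents ("blend"), so the
    merged network has layer-to-layer relationships that exist in none
    of its parents. (Toy sketch; not Writer's actual technique.)
    """
    rng = random.Random(seed)
    merged = {}
    for name in models[0]:
        if rng.random() < 0.5:
            # swap: take this layer verbatim from one random parent
            merged[name] = list(rng.choice(models)[name])
        else:
            # blend: average this layer elementwise across all parents
            cols = zip(*(m[name] for m in models))
            merged[name] = [sum(c) / len(models) for c in cols]
    return merged
```

The “don’t break the relationships” step the interview mentions is the hard part and is not modeled here; in practice merged layers must stay compatible or the output degrades into the gibberish Al Shikh warns about.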

It reminds me of how creativity often happens on the blurred line between fact and fiction.

One hundred percent. But we have to define it, especially for business customers. Ultimately, we want the model to say whatever it wants, but we need it to be careful about one thing, which we call assertions. There’s a difference between “let me give you a crazy idea” and a claim that seems out of control. We’ve done a lot of work around what we call controlled claims. We have no single source of truth [for the model], because we can’t consider Wikipedia, for example, a source of truth, can we? It has a lot of random stuff. We can’t take every single thing from every single government on the planet as a source of truth, either. So we decided the model will remain creative, but it should not assert a claim as fact.

Hallucinations often come with the additional question of explainability, since models have to justify themselves. Maybe that’s less of a problem when there are no claims to verify?

Exactly. We decided to start from the root and check the claims… The [Palmyra] Creative model is less about knowledge and more about behavior. We think businesses will love this creative model for writing a case study, finding new use cases or writing more creative stories about how to adopt their products, and for explaining that without sounding like AI. But claim control was the biggest part. Like you said, if you don’t make the claim, you don’t have to explain it.

How do you control when the model should evolve or be creative, and when it should be consistent?

We’ve been working on this since the beginning of summer. What if we could make these models think more like a human? What if models could bounce, pivot and remember? Can we basically get them to work beyond the training set in real time? All models today are still stuck to their training data; without it, it’s really hard to get them to do anything. This is what we call self-evolving. Self-evolving models mean you don’t have to teach them. The model will update its weights in real time. The model will actually reflect. And the model itself can actually provide information.

To give a rough example: if I say my name is Waseem and I’m the president of the United States, the model will be smart enough to know, “You may be Waseem, but you’re not the president of the United States.” That’s really important, because it means the more you use it, the more control and knowledge the model gains. It’s higher-level and would take a lot of time to explain, but it’s a standard transformer design with a new feature called memory. Each layer inside the neural network has a memory layer next to it. So you can actually talk to it and see it change.

The model basically won’t make the same mistake twice, because it knows the wrong answer. It remembers the wrong [one], and next time it will try something different when it thinks about the question. I like to tell my team that most people — not all — learn from their mistakes and don’t make the same mistake twice.
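The per-layer memory idea can be sketched as a small key-value store sitting beside a layer: corrections written at inference time are recalled for similar later queries, so a mistake corrected once is not repeated and no gradient update is needed. This is a hypothetical toy; the class, function and threshold below are invented for illustration and are not Writer’s actual architecture.

```python
class MemoryLayer:
    """A memory store sitting "next to" one transformer layer.

    Holds (key, correction) pairs written at inference time. On a read,
    the correction whose key is most similar to the query (by cosine
    similarity, above a threshold) is returned, so one correction
    changes later answers without retraining.
    (Toy sketch of the idea described in the interview.)
    """
    def __init__(self, threshold=0.9):
        self.keys, self.corrections = [], []
        self.threshold = threshold

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    def write(self, key, correction):
        # real-time update: just append to memory, no gradient step
        self.keys.append(key)
        self.corrections.append(correction)

    def read(self, query):
        # return the best-matching correction, or None if nothing is close
        best, best_sim = None, self.threshold
        for k, c in zip(self.keys, self.corrections):
            sim = self._cosine(query, k)
            if sim >= best_sim:
                best, best_sim = c, sim
        return best

def forward(hidden, memory):
    """Apply a layer's memory correction to its hidden state, if any matches."""
    correction = memory.read(hidden)
    if correction is not None:
        hidden = [h + c for h, c in zip(hidden, correction)]
    return hidden
```

After one `write`, a repeat of the same (or a sufficiently similar) hidden state comes back corrected, while unrelated states pass through unchanged, which is the “won’t make the same mistake twice” behavior in miniature.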

Prompts & Products — AI news and announcements this week

  • Rembrand, a generative AI startup that helps brands place virtual products in social and other content, raised $23 million in Series A funding.
  • Lucid Motors, an electric car company, is partnering with SoundHound AI to integrate a new in-vehicle voice assistant that gives drivers real-time information and more in-car controls.
  • A new TurboTax campaign is promoting the AI agents and “AI experts” in the Intuit-owned app that help people file their taxes.
  • AI will be all over CES 2025 in Las Vegas next week, as tech giants, startups and brands descend on the Nevada desert to promote their various updates and partnerships.
