OpenAI’s GPT-4 language model is more trustworthy than its predecessor, GPT-3.5, but also more susceptible to jailbreaking and bias, according to a study backed by Microsoft. The researchers found that GPT-4 is better at protecting private information and resisting standard adversarial attacks, yet more likely to be led astray by misleading information and deliberately tricky prompts. These vulnerabilities did not surface in consumer-facing GPT-4-based products, thanks to mitigation measures applied in finished AI applications. The researchers hope their findings will encourage further work toward building more trustworthy models.