By Catherine Card, Director of Product Management

Around the world, a death by suicide occurs every 40 seconds, and suicide is the second-leading cause of death for 15- to 29-year-olds. In the US, nearly 45,000 people take their own lives every year.

When someone is expressing thoughts of suicide, it’s important to get them help as quickly as possible. Because friends and family are connected through Facebook, we can help a person in distress get in touch with people who can support them. For years, people have had the ability to report Facebook posts that they feel indicate someone is thinking about suicide. This flags the posts for review by trained members of our Community Operations team, who can connect the poster with support resources if needed.

Last year, we began to use machine learning to expand our ability to get timely help to people in need. The tool identifies posts from people who might be at risk using signals such as phrases in the post itself and concerned comments from friends and family.

Actually getting a computer to recognize suicidal expression, however, was a complex exercise in analyzing human nuance. One of the biggest challenges the team faced was that so many phrases that might indicate suicidal intent — “kill,” “die,” “goodbye” — are commonly used in other contexts. A human being might recognize that “I have so much homework I want to kill myself” is not a genuine cry of distress, but how do you teach a computer that kind of contextual understanding? Until the team solved that problem, the machine learning model flagged too many false positives to be a useful filter for the human review team.

To train a machine learning classifier, you need to feed it tons of examples, both of what you’re trying to identify (positive examples) and of what you’re not (negative examples), so that the classifier learns to distinguish patterns between the two. (Check out the first video on this page for more on this concept.) Usually, you want thousands or even millions of examples in each category. But when it comes to Facebook posts that contain suicidal expressions, the team had, thankfully, relatively few positive examples to work from. That sparse data set was dwarfed by the negative examples: the entire universe of Facebook text posts that were not suicidal expressions. This imbalance only made the context challenge harder.
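To see why that imbalance matters, consider a toy sketch (our illustration, not Facebook's actual system or numbers): with extreme class imbalance, a degenerate classifier that never flags anything still scores near-perfect accuracy while catching no one at risk.

```python
# Hypothetical labels: 1 = suicidal expression, 0 = everything else.
labels = [1] * 5 + [0] * 9995  # 5 positives among 10,000 posts

# A degenerate "classifier" that always predicts the majority class.
predictions = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
recall = sum(p == y == 1 for p, y in zip(predictions, labels)) / labels.count(1)

print(f"accuracy: {accuracy:.4f}")  # 0.9995 -- looks great on paper
print(f"recall:   {recall:.4f}")    # 0.0000 -- catches no one at risk
```

This is why an imbalanced data set alone can't teach a model the distinction that matters.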

The team’s big breakthrough was the realization that they had a smaller and more precise set of negative examples: the Facebook posts that people had flagged as potentially containing thoughts of suicide, but that the trained Community Operations reviewers determined did not show a person at risk of self-harm. This set of negative examples contained a lot of posts of the “I have so much homework I want to kill myself” type, which allowed the classifiers to be trained much more precisely on genuine suicidal expressions.
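The idea behind these "hard negatives" can be sketched in a few lines (a deliberately tiny illustration of the concept, not Facebook's code or data): words that appear in both the confirmed at-risk posts and the reported-but-cleared posts cancel out, so the model stops relying on alarming-but-ambiguous words like "kill" or "die" on their own.

```python
positives = ["i want to die i am saying goodbye"]            # confirmed at-risk (toy data)
hard_negatives = ["so much homework i want to kill myself"]  # reported, then cleared

def word_weights(pos_docs, neg_docs):
    """Weight = how much more often a word appears in positives than in
    the hard negatives; words common to both contexts cancel out."""
    pos_words = [w for d in pos_docs for w in d.split()]
    neg_words = [w for d in neg_docs for w in d.split()]
    return {w: pos_words.count(w) - neg_words.count(w)
            for w in set(pos_words) | set(neg_words)}

weights = word_weights(positives, hard_negatives)

def score(post):
    return sum(weights.get(w, 0) for w in post.split())

print(score("i want to die"))                       # 2 -- still scores high
print(score("this homework makes me want to die"))  # 0 -- context cancels it out
```

Real systems learn these weights statistically over far more data, but the mechanism is the same: the reviewed-and-cleared posts teach the model which contexts make alarming words harmless.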

“The smaller data set helped us develop a much more nuanced understanding of what is a suicidal pattern and what isn’t,” says Dan Muriello, an engineer on the team that produced the tools.

The text of the post is only one factor the algorithm examines to determine whether a post should be flagged for review. It also looks at comments left on the post. Here, too, there is linguistic nuance to consider: posts that reviewers determined were serious cases of people in imminent harm tended to have comments like “Tell me where you are” or “Has anyone heard from him/her?” while potentially less urgent posts had comments more along the lines of “Call anytime” or “I’m here for you.” The day and time of the original post are factors as well: experts say the early hours of the morning, and Sundays, when the workweek looms, can be common times for contemplating suicide.
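These signals can be imagined as one combined feature vector. The sketch below is our illustration of that idea, assuming the factors the article names (text score, urgent comments, time of day, day of week); the function and field names are hypothetical.

```python
from datetime import datetime

# Toy phrase list based on the comment patterns described in the article.
URGENT_COMMENT_PHRASES = ["tell me where you are", "has anyone heard from"]

def comment_urgency(comments):
    """Count comments that contain an urgent-sounding phrase."""
    return sum(any(p in c.lower() for p in URGENT_COMMENT_PHRASES)
               for c in comments)

def features(text_score, comments, posted_at):
    """Assemble the article's signals into one numeric feature dict."""
    return {
        "text_score": text_score,                              # from the text classifier
        "urgent_comments": comment_urgency(comments),          # comment signal
        "early_morning": 1 if 0 <= posted_at.hour < 6 else 0,  # time-of-day signal
        "is_sunday": 1 if posted_at.weekday() == 6 else 0,     # day-of-week signal
    }

# A post at 2:30 AM on a Sunday with one urgent comment:
f = features(0.8, ["Tell me where you are"], datetime(2018, 3, 4, 2, 30))
print(f)
```

A downstream model can then weigh all of these signals together rather than relying on the text alone.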

Even with the introduction of these AI-fueled detection efforts, people are still core to Facebook’s success around suicide prevention. That’s why anyone who flags a potential cry for help is shown support options, including resources for help and ways to connect with loved ones. And whether a post is reported by a concerned friend or family member or identified via machine learning, the next steps in the process remain the same. A trained member of Facebook’s Community Operations team reviews it to determine if the person is at risk — and if so, the original poster is shown support options, such as prompts to reach out to a friend and help-line phone numbers. In serious cases, when it’s determined that there may be imminent danger of self-harm, Facebook may contact local authorities. Since these efforts began last year, we’ve worked with first responders on over 1,000 wellness checks based on reports we’ve received from our proactive detection efforts.

Technology can’t replace people in the process, but it can be an aid to connect more people in need with compassionate help.

“We’re not doctors, and we’re not trying to make a mental health diagnosis,” says Muriello. “We’re trying to get information to the right people quickly.”

(1) Various combinations of words can have positive or negative impact on the classifier’s confidence that the content includes a suicidal thought.
(2) The classifiers score the content based on how closely correlated the main text and comments are to previously confirmed cases of suicidal expression.
(3) The classifier scores, combined with other numerical details (e.g., time of day, day of week), are fed into a “random forest” learning algorithm, a type of machine learning suited to numerical data. It uses many decision trees and outputs the mean prediction of the individual trees.
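Step (3) can be sketched minimally as follows. A real random forest learns hundreds of trees from data via a library; these three tiny hand-written "trees" are only stand-ins to show how the ensemble's output is the mean of the individual trees' predictions.

```python
# Hypothetical stand-in trees, each splitting on one of the signals
# described above (thresholds are invented for illustration).
def tree_a(x):  # splits on the text-classifier score
    return 0.9 if x["text_score"] > 0.7 else 0.2

def tree_b(x):  # splits on comment urgency
    return 0.8 if x["urgent_comments"] >= 1 else 0.3

def tree_c(x):  # splits on time of day
    return 0.6 if x["early_morning"] else 0.4

def forest_predict(x, trees=(tree_a, tree_b, tree_c)):
    """Random-forest-style output: the mean prediction of the trees."""
    return sum(t(x) for t in trees) / len(trees)

x = {"text_score": 0.85, "urgent_comments": 2, "early_morning": 1}
print(round(forest_predict(x), 3))  # (0.9 + 0.8 + 0.6) / 3 = 0.767
```

Averaging many trees trained on different slices of the data is what makes a random forest robust: no single noisy signal dominates the final score.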

For more information: