
Graphs are, quite simply, a universal way of representing relationships between entities – starting with immediate connections, then “jumping” to connections of connections, and so on. The farther out the traversal goes, the wider the tree gets.
To make sense of this, graph neural networks (GNNs) are often applied. These deep learning models are specialized to understand graphs.
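To make the idea of “hops” concrete, here is a minimal, self-contained Python sketch; the toy graph and names are invented purely for illustration and are not drawn from LinkedIn’s data or code:

```python
from collections import deque

# A toy social graph as an adjacency list (names are purely illustrative).
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave", "erin"],
    "dave": ["bob", "carol"],
    "erin": ["carol"],
}

def neighbors_within(graph, start, max_hops):
    """Return each reachable node keyed by how many hops away it is."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for nbr in graph[node]:
            if nbr not in seen:
                seen[nbr] = seen[node] + 1
                queue.append(nbr)
    return seen

print(neighbors_within(graph, "alice", max_hops=2))
# {'alice': 0, 'bob': 1, 'carol': 1, 'dave': 2, 'erin': 2}
```

Each additional hop pulls in connections of connections, which is exactly why the tree widens so quickly on a large social network.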
Yet, when it comes to social media today, GNNs are less than optimal. When applied to determine the connections between friends, acquaintances and professional colleagues, they often fail to capture the complex nuances and degrees of those relationships. This makes it difficult for platforms like LinkedIn, Twitter, Facebook and Instagram to make accurate recommendations – a task that is central to their mission.
LinkedIn takes a PASS at GNNs
To overcome these inherent challenges of GNNs and improve its recommendation capabilities, LinkedIn created a process it calls Performance-Adaptive Sampling Strategy (PASS). It uses AI to select the graph neighbors that are most relevant to a given task, improving predictive accuracy.
After applying the new GNN model to its own recommendation engines, the professional networking platform has just released PASS for the open source community.
“We want to compare our methods to other researchers’ datasets,” said Jaewon Yang, senior software engineer at LinkedIn, who led the development of PASS. “We hope they can rely on our networks.”
At a high level, LinkedIn uses GNNs to understand the relationships between individual members, groups, skills, and interests — at the primary, secondary, tertiary levels, and beyond — to help inform recommendations.
According to its creators, PASS’s neighbor-selection AI model sharpens this process by examining a given neighbor’s attributes to decide whether or not to select it. It can also help detect whether one of those neighbors is actually a bot or a fake account by assessing the authenticity of its connections. The adaptive model learns to select the neighbors that improve its accuracy.
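The sketch below is not LinkedIn’s implementation – PASS’s actual code is in its open-source release – but it illustrates the general idea of learned neighbor selection under simple assumptions: score each candidate neighbor with trainable parameters and keep only the top-scoring ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not LinkedIn's code): each node has a feature vector,
# and "w" stands in for parameters a PASS-like sampler would learn from
# task feedback rather than draw at random.
node_features = {i: rng.normal(size=8) for i in range(6)}
neighbors_of = {0: [1, 2, 3, 4, 5]}
w = rng.normal(size=8)

def select_neighbors(node, k):
    """Score each neighbor against the target node and keep the top k."""
    scores = {
        nbr: float(node_features[node] @ (w * node_features[nbr]))
        for nbr in neighbors_of[node]
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(select_neighbors(0, k=2))  # the two neighbors the sampler deems most relevant
```

In a trained system, the scoring parameters would be adjusted so that keeping high-scoring neighbors measurably improves the downstream prediction.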
“Sometimes people miss other titles that may be very relevant to a job posting or other recommendation,” Yang said. “We want to understand precisely who is following this user, we want to understand which other users are following users.”
Traditional GNNs can be difficult to adapt to social networks because those networks present many potential relationships, not all of which are relevant to a given task, Yang said. For example, a member’s connections may be personal friends working in different fields, which decreases the accuracy of recommendations.
Meanwhile, a top influencer or public figure could have hundreds of millions of connections – defying the sociological “Dunbar’s number” theory that a person can only maintain a limited number of relationships, Yang pointed out – and it is impossible to compute them all.
“These present an explosive number of data points that need to be considered,” he said. “We can’t consider them all, we have to sample a few.”
Some existing methods have attempted to overcome scaling challenges by sampling a fixed number of “neighbors”, thereby reducing the inputs fed into the GNN. But such samplers aren’t fully representative, Yang said, and don’t consider which neighbors might prove most relevant.
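The kind of fixed-size, task-agnostic sampling Yang describes looks roughly like the toy sketch below (written in the spirit of samplers such as GraphSAGE, not taken from any specific library):

```python
import random

def sample_fixed_neighbors(graph, node, k, seed=None):
    """Task-agnostic sampling: pick k neighbors uniformly at random,
    with replacement if the node has fewer than k neighbors."""
    rng = random.Random(seed)
    nbrs = graph[node]
    if len(nbrs) >= k:
        return rng.sample(nbrs, k)
    return [rng.choice(nbrs) for _ in range(k)]

# A node with an enormous neighborhood is cut down to a fixed-size sample,
# regardless of which neighbors actually matter for the task at hand.
graph = {"influencer": [f"follower_{i}" for i in range(1_000_000)]}
print(sample_fixed_neighbors(graph, "influencer", k=10, seed=42))
```

Because the draw is uniform, a highly relevant professional contact has the same chance of being kept as an unrelated acquaintance – the gap PASS’s learned sampler is designed to close.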
IBM and Yale platforms boost GNNs
Other organizations are deploying similar platforms that attempt to boost existing GNNs. For example, Yale University and IBM recently proposed a concept they call Kernel Graph Neural Networks (KerGNN), which integrates graph kernels into GNN message passing – the process by which vector messages are exchanged between nodes in a graph and node representations are then updated. According to the Yale researchers, the KerGNN method improves the interpretability of the model compared to conventional GNNs.
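Message passing itself can be illustrated with a bare-bones sketch – plain mean aggregation only, without the graph kernels that KerGNN layers on top:

```python
import numpy as np

def message_passing_layer(features, adjacency, weight):
    """One generic message-passing step: every node averages its neighbors'
    feature vectors (the incoming "messages") and then updates its own
    representation through a learned linear map."""
    updated = np.zeros_like(features)
    for node, nbrs in enumerate(adjacency):
        msg = features[nbrs].mean(axis=0) if nbrs else np.zeros(features.shape[1])
        updated[node] = np.tanh((features[node] + msg) @ weight)
    return updated

rng = np.random.default_rng(1)
feats = rng.normal(size=(4, 3))   # 4 nodes, 3-dimensional features
adj = [[1, 2], [0], [0, 3], [2]]  # neighbor lists per node
W = rng.normal(size=(3, 3))       # stand-in for trained weights
print(message_passing_layer(feats, adj, W))
```

Stacking several such layers is what lets information propagate across two, three or more hops of the graph.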
Similarly, Google released TensorFlow Graph Neural Networks, a library designed to make it easier to work with graph-structured data in its TensorFlow machine learning (ML) framework. Twitter, Pinterest, Airbnb and others are also researching and publishing tools to help address GNN limitations.
PASS has been shown to achieve higher prediction accuracy even though it uses fewer neighbors than other GNN models. In experiments on seven public benchmark graphs and two LinkedIn graphs, PASS outperformed other GNN methods by up to 10.4%. It also showed up to 3x higher accuracy than baseline methods when so-called “noisy edges” were added.
In open-sourcing PASS, the hope is that other researchers will find new ways to apply the platform, Yang said, making it more expressive, flexible and easier to model, addressing its limitations and continually expanding its use across a variety of applications.
“This technology is moving very quickly,” said Romer Rosales, senior director of AI at LinkedIn. “We’re only scratching the surface in terms of all the uses it can have. There is plenty of room for us to grow, and for the community in general to grow in this space.”
LinkedIn researchers will continue to refine PASS to tackle increasingly large datasets without losing expressive power, he said. The goal is to eventually automate some processes that still require human input, such as specifying parameters for how to sample hops and whether the system should look two hops, three hops, or further down the chain.
“It’s fertile ground to try out these new ideas,” Rosales said. “We hope other communities will also join us, and we will join other communities in trying and sharing these experiences.”
PASS is one of several projects that LinkedIn has made open source this year, he pointed out. Another is FastTreeSHAP, a package for the Python programming language. It allows algorithm results to be interpreted more effectively, improving AI transparency and supporting explainable AI to build trust and aid decision-making in areas such as business forecasting, recruiter search and job search models. It also helps modelers with debugging and general improvements.
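As a rough usage sketch, FastTreeSHAP is typically used as a drop-in replacement for the standard SHAP tree explainer; the example below assumes that shap-style TreeExplainer interface, and exact argument names may differ from the released package.

```python
# Rough sketch; assumes `pip install fasttreeshap scikit-learn` and the
# shap-style TreeExplainer interface the package documents.
import fasttreeshap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# "algorithm" chooses between the original TreeSHAP and the faster variants.
explainer = fasttreeshap.TreeExplainer(model, algorithm="auto", n_jobs=-1)
shap_values = explainer(X).values  # per-sample, per-feature attributions
print(shap_values.shape)
```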
Another project is Feathr, a feature store that makes it easier to manage ML features at scale and improves developer productivity. Dozens of apps use the feature store to define features, compute them for training, deploy them to production and share them across teams. Feathr users have reported a significant reduction in the time it takes to add new features to model training workflows, as well as improved runtime performance over previous application-specific feature pipeline tools.
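Feathr’s own API is beyond the scope of this article, but the define-once, reuse-everywhere pattern a feature store provides can be illustrated with a deliberately simplified, hypothetical registry; none of the names below come from Feathr.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class FeatureDef:
    """A named feature plus the function that computes it from a raw record."""
    name: str
    compute: Callable[[dict], float]

class FeatureRegistry:
    """Toy stand-in for a feature store: teams register features once,
    then any training or serving job can look them up by name."""
    def __init__(self):
        self._features: Dict[str, FeatureDef] = {}

    def register(self, feature: FeatureDef):
        self._features[feature.name] = feature

    def compute(self, names, record):
        return {n: self._features[n].compute(record) for n in names}

registry = FeatureRegistry()
registry.register(FeatureDef("connection_count", lambda r: float(len(r["connections"]))))
registry.register(FeatureDef("profile_completeness", lambda r: r["filled_fields"] / r["total_fields"]))

member = {"connections": ["a", "b", "c"], "filled_fields": 8, "total_fields": 10}
print(registry.compute(["connection_count", "profile_completeness"], member))
```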
“PASS is one example of a long series of AI projects that we have opened up to the community,” Rosales said, “with the aim of sharing our experience and helping to create AI algorithms and tools that are more scalable, expressive and responsible.”