Knowledge distillation (KD) has been shown to be effective in boosting the
performance of graph neural networks (GNNs), where the typical objective is to
distill knowledge from a deeper teacher GNN into a shallower student GNN.
However, it is often quite challenging to train a satisfactory deeper GNN due
to the well-known over-parameterization and over-smoothing issues, leading to
invalid knowledge transfer in practical applications. In this paper, we propose
the first Free-direction Knowledge Distillation framework via reinforcement
learning for GNNs, called FreeKD, which no longer requires a deeper,
well-optimized teacher GNN. Our core idea is to collaboratively learn
two shallower GNNs that exchange knowledge with each other. Observing that a
typical GNN model often performs better at some nodes and worse at others
during training, we devise a dynamic and free-direction knowledge transfer
strategy that involves two levels of actions: 1) a node-level action
determines the direction of knowledge transfer between the corresponding nodes
of the two networks; and then 2) a structure-level action determines which of
the local structures generated by the node-level actions are propagated.
Additionally, considering that different augmented graphs can potentially
capture distinct perspectives of the graph data, we propose FreeKD-Prompt that
learns undistorted and diverse augmentations based on prompt learning for
exchanging varied knowledge. Furthermore, instead of confining knowledge
exchange within two GNNs, we develop FreeKD++ to enable free-direction
knowledge transfer among multiple GNNs. Extensive experiments on five benchmark
datasets demonstrate that our approaches outperform the base GNNs by a large
margin.
More surprisingly, FreeKD achieves performance comparable to or even better
than that of traditional KD algorithms that distill knowledge from a deeper and
stronger teacher GNN.
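
The node-level action described above can be illustrated with a small sketch.
This is not the FreeKD algorithm itself: the abstract states that the directions
are chosen via reinforcement learning, whereas the code below substitutes a
simple greedy rule (the network with the lower cross-entropy at a node acts as
teacher for that node). The function names, the "a->b" / "b->a" labels, and the
toy probabilities are all hypothetical, and the structure-level action is
omitted.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label for one node."""
    return -math.log(max(probs[label], EPS))


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(
        pi * math.log(max(pi, EPS) / max(qi, EPS))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def node_level_directions(probs_a, probs_b, labels):
    """Pick a transfer direction per node.

    Assumption: a greedy rule stands in for the learned policy.
    'a->b' means network A teaches network B at that node.
    """
    directions = []
    for pa, pb, y in zip(probs_a, probs_b, labels):
        a_is_better = cross_entropy(pa, y) <= cross_entropy(pb, y)
        directions.append("a->b" if a_is_better else "b->a")
    return directions


def distillation_losses(probs_a, probs_b, directions):
    """Sum the KL terms each network receives in its role as student."""
    loss_a, loss_b = 0.0, 0.0
    for pa, pb, d in zip(probs_a, probs_b, directions):
        if d == "a->b":
            loss_b += kl_divergence(pa, pb)  # A is teacher, B is student
        else:
            loss_a += kl_divergence(pb, pa)  # B is teacher, A is student
    return loss_a, loss_b


# Toy softmax outputs of two networks on two nodes with two classes.
probs_a = [[0.9, 0.1], [0.4, 0.6]]
probs_b = [[0.6, 0.4], [0.2, 0.8]]
labels = [0, 1]

directions = node_level_directions(probs_a, probs_b, labels)
print(directions)  # ['a->b', 'b->a']
print(distillation_losses(probs_a, probs_b, directions))
```

In this toy case each network is the teacher at one node and the student at the
other, which is the per-node, two-way exchange that a fixed teacher-student
setup cannot express.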