Methods and systems are provided for identifying drug compounds for targeting proteins in tissue cells. Such a method includes providing a neural network model which comprises an attention-based protein encoder and a molecular decoder. The protein encoder is pretrained in an autoencoder architecture to encode an input protein sequence into an output vector in a latent space representing proteins. The molecular decoder is pretrained in an autoencoder architecture to generate compound data, defining a compound molecule, from an input vector in a latent space representing molecules. The protein encoder and molecular decoder are coupled such that the input vector of the molecular decoder is dependent on the output vector of the protein encoder for an input protein sequence.
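The coupling described above can be illustrated with a minimal sketch, which is not the claimed implementation. It assumes a single-head self-attention encoder with mean pooling, a linear "bridge" that maps the protein latent vector to the molecular latent vector, and a greedy recurrent decoder over a toy SMILES-like token set. All class names, dimensions, and the token set are hypothetical, and the weights are random rather than pretrained, so the generated string is not expected to be a valid molecule.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Toy output vocabulary (assumption); index 0 ends generation.
SMILES_TOKENS = ["<eos>", "C", "N", "O", "c", "1", "(", ")", "="]


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(xs):
    mx = max(xs)
    es = [math.exp(x - mx) for x in xs]
    total = sum(es)
    return [e / total for e in es]


class ProteinEncoder:
    """Single-head self-attention over residue embeddings, mean-pooled."""

    def __init__(self, dim, rng):
        self.dim = dim
        self.embed = {aa: [rng.gauss(0, 1) for _ in range(dim)] for aa in AMINO_ACIDS}
        self.wq = rand_matrix(dim, dim, rng)
        self.wk = rand_matrix(dim, dim, rng)
        self.wv = rand_matrix(dim, dim, rng)

    def encode(self, sequence):
        xs = [self.embed[aa] for aa in sequence]
        qs = [matvec(self.wq, x) for x in xs]
        ks = [matvec(self.wk, x) for x in xs]
        vs = [matvec(self.wv, x) for x in xs]
        scale = math.sqrt(self.dim)
        attended = []
        for q in qs:
            # Scaled dot-product attention weights of this residue over all residues.
            weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in ks])
            attended.append(
                [sum(w * v[i] for w, v in zip(weights, vs)) for i in range(self.dim)]
            )
        # Mean-pool the attended residues into one protein latent vector.
        return [sum(row[i] for row in attended) / len(attended) for i in range(self.dim)]


class MolecularDecoder:
    """Greedy recurrent decoder whose initial state is the molecular latent vector."""

    def __init__(self, dim, rng):
        self.wh = rand_matrix(dim, dim, rng)
        self.wo = rand_matrix(len(SMILES_TOKENS), dim, rng)
        self.tok = {t: [rng.gauss(0, 1) for _ in range(dim)]
                    for t in SMILES_TOKENS + ["<bos>"]}

    def decode(self, z, max_len=20):
        h, prev, out = list(z), "<bos>", []
        for _ in range(max_len):
            rec = matvec(self.wh, h)
            h = [math.tanh(r + e) for r, e in zip(rec, self.tok[prev])]
            logits = matvec(self.wo, h)
            prev = SMILES_TOKENS[max(range(len(logits)), key=logits.__getitem__)]
            if prev == "<eos>":
                break
            out.append(prev)
        return "".join(out)


class CoupledModel:
    """Protein encoder -> linear bridge (assumption) -> molecular decoder."""

    def __init__(self, protein_dim=8, molecule_dim=6, seed=0):
        rng = random.Random(seed)
        self.encoder = ProteinEncoder(protein_dim, rng)
        self.bridge = rand_matrix(molecule_dim, protein_dim, rng)
        self.decoder = MolecularDecoder(molecule_dim, rng)

    def generate(self, sequence):
        z_protein = self.encoder.encode(sequence)
        # The decoder input depends on the encoder output via the bridge.
        z_molecule = matvec(self.bridge, z_protein)
        return z_protein, z_molecule, self.decoder.decode(z_molecule)
```

Under these assumptions, calling CoupledModel().generate("MKTAYIAK") returns the protein latent vector, the molecular latent vector derived from it, and the decoded token string. In a real system the encoder and decoder would each be loaded from their pretrained autoencoders, and the coupling between the two latent spaces would be learned rather than random.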