bcfind is a tool for fully automated localization of somata in 3D mouse brain images acquired by confocal light-sheet microscopy. Its core technique is supervised semantic deconvolution, which uses a neural network to map a 3D image into a synthetic image in which the visibility of specific entities of interest (neuronal somata, in this case) is enhanced and standardized.
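As a rough, self-contained illustration of the idea (not bcfind's actual network, training data, or pipeline), the sketch below trains a voxel-wise linear regressor, standing in for the neural network, to map noisy 3×3×3 patches of a synthetic raw volume onto an idealized "semantic" target volume containing a single soma blob; all names and parameters here are hypothetical:

```python
import random

def make_volume(n, fill):
    return [[[fill(z, y, x) for x in range(n)] for y in range(n)] for z in range(n)]

n = 7
c = n // 2
# Ground-truth "semantic" volume: a single bright soma blob at the centre.
truth = make_volume(n, lambda z, y, x: 1.0 if abs(z-c)+abs(y-c)+abs(x-c) <= 1 else 0.0)
rng = random.Random(0)
# Observed raw volume: dimmed signal plus acquisition noise.
observed = make_volume(n, lambda z, y, x: 0.5 * truth[z][y][x] + 0.2 * rng.random())

def patch(vol, z, y, x):
    """Flattened 3x3x3 neighbourhood of voxel (z, y, x)."""
    return [vol[z+dz][y+dy][x+dx]
            for dz in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

# Voxel-wise regressor (a linear stand-in for the neural network):
# map each raw 3x3x3 patch to the corresponding "ideal" centre voxel.
w, b, lr = [0.0] * 27, 0.0, 0.05
data = [(patch(observed, z, y, x), truth[z][y][x])
        for z in range(1, n-1) for y in range(1, n-1) for x in range(1, n-1)]

def mse():
    return sum((sum(wi*pi for wi, pi in zip(w, p)) + b - t) ** 2
               for p, t in data) / len(data)

before = mse()
for _ in range(200):                      # plain SGD on squared error
    for p, t in data:
        err = sum(wi*pi for wi, pi in zip(w, p)) + b - t
        for i in range(27):
            w[i] -= lr * err * p[i]
        b -= lr * err
after = mse()
print(after < before)  # True: the learned filter moves the raw volume toward the target
```

Applying the learned filter at every voxel produces an enhanced volume in which the blob stands out against the noise, which is the sense in which semantic deconvolution "standardizes" the appearance of somata before localization.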
Type Extension Trees are a powerful representation language for describing count-of-count features in relational domains. These features characterize the combinatorial structure of relational neighborhoods, are more powerful than simple count statistics, and can be used to develop relational learning algorithms. In our new AIJ paper on type extension trees we present a structure learning algorithm for constructing TETs from data, and exploit the Earth Mover's Distance to develop a supervised instance-based relational learner. Experiments on bibliographic data show that TET learning is able to discover the count-of-count feature underlying the definition of the h-index, as well as the inverse document frequency feature commonly used in information retrieval.
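To make the two ingredients concrete: the h-index is a count-of-count feature (it counts papers by their citation counts), and in one dimension the Earth Mover's Distance between two equal-size samples reduces to the mean absolute difference of their sorted values. The snippet below is only an illustrative sketch of these two notions, not code from the TET system; `h_index`, `emd_1d`, and the author data are hypothetical helpers:

```python
def h_index(citation_counts):
    """h-index: the largest h such that at least h papers have >= h citations.
    A classic count-of-count feature: it counts papers by their citation counts."""
    h = 0
    for rank, c in enumerate(sorted(citation_counts, reverse=True), start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def emd_1d(xs, ys):
    """Earth Mover's Distance between two equal-size 1-D samples:
    mean absolute difference of the sorted values."""
    assert len(xs) == len(ys)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

# Two authors described by their per-paper citation counts.
author_a = [10, 8, 5, 4, 3]
author_b = [25, 2, 1, 1, 1]

print(h_index(author_a))           # 4
print(h_index(author_b))           # 2
print(emd_1d(author_a, author_b))  # 6.0
```

An instance-based learner in this spirit compares two relational instances by the distance between their count distributions rather than by a single summary statistic, which is why it can recover features like the h-index that depend on the whole distribution.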
kLog is a logical and relational language for kernel-based learning embedded in Prolog. It allows users to specify logical and relational learning problems at a high level, in a declarative way. It builds on simple but powerful concepts: learning from interpretations, entity/relationship data modeling, logic programming and deductive databases, and graph kernels. kLog can use numerical and symbolic data, as well as background knowledge in the form of Prolog or Datalog programs (as in inductive logic programming systems); several statistical procedures can be used to fit the model parameters.
Besides the two systems described above, we have developed other algorithms for learning in relational domains. Relational information gain is a novel refinement scoring function measuring the informativeness of newly introduced variables in ILP systems. We have also integrated kernel methods into relational learning systems in various ways. kFOIL is a dynamic propositionalization technique that uses a logical kernel and constructs clauses by leveraging FOIL search, with SVM performance guiding the search. Kernels on Prolog proof trees are derived from background knowledge expressed in first-order logic.
MetalDetector identifies cysteines and histidines involved in transition-metal binding sites in proteins, starting from the protein sequence alone. The prediction server is available at metaldetector.dsi.unifi.it.
Source code is also available.
Disulfind predicts disulfide bridges in proteins, starting from the protein sequence alone. The prediction server is available at disulfind.dsi.unifi.it. If you are on Debian or Ubuntu you may install a standalone version:
sudo apt-get install disulfinder
Both MetalDetector and Disulfind are also integrated into PredictProtein.
MLOCSR converts bitmap images of chemical structural formulae into machine-readable vector formats (such as MOL and SDF).
We have developed graph kernels for predicting the activity of small molecules from 2D and 3D representations. Source code written by Fabrizio Costa and Alessio Ceroni for the original method (which includes a 3D kernel) is available here. For the 2D case, a more recent and significantly evolved approach is EDeN, developed at the University of Freiburg.
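To give a flavor of how a 2D graph kernel compares molecules, here is a toy labeled-edge-count kernel: each molecule becomes a multiset of canonical (atom, bond, atom) triples, and the kernel is the dot product of the two count vectors. This is a much simplified stand-in, not the decomposition kernels used in the actual systems; the graph encoding and names are hypothetical:

```python
from collections import Counter

def edge_features(atoms, bonds):
    """Feature multiset: one canonical (atom, bond_order, atom) triple per bond."""
    feats = Counter()
    for i, j, order in bonds:
        a, c = sorted((atoms[i], atoms[j]))   # canonical endpoint order
        feats[(a, order, c)] += 1
    return feats

def graph_kernel(g1, g2):
    """Dot product of the two edge-count vectors."""
    f1, f2 = edge_features(*g1), edge_features(*g2)
    return sum(f1[k] * f2[k] for k in f1)

# Molecules as (atom labels, bonds (i, j, bond_order)).
ethanol  = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])  # C-C-O
methanol = (["C", "O"],      [(0, 1, 1)])             # C-O

print(graph_kernel(ethanol, methanol))  # 1: the shared C-O single bond
print(graph_kernel(ethanol, ethanol))   # 2: C-C and C-O
```

Real molecular graph kernels extend this idea from single bonds to larger substructures (walks, subtrees, neighborhood pairs), which is what makes them informative enough for activity prediction.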
We have applied several machine learning methods to the prediction of protein structure. Source code for a beta-residue contact predictor based on neural networks and Markov logic (written by Marco Lippi) is available for download. The bidirectional RNN described in Baldi et al. (1999) was originally developed by G. Pollastri in our lab and has subsequently evolved into the state-of-the-art systems SSpro and PORTER.
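The key property of the bidirectional architecture is that the prediction at each residue depends on both the left and the right context of the sequence. The toy sketch below (random untrained weights, not the SSpro/PORTER architecture) runs a forward and a backward recurrence over a numeric sequence and combines the two hidden states at every position:

```python
import math
import random

def brnn(seq, dim=4, seed=0):
    """Per-position outputs of a toy bidirectional RNN: a left-to-right and a
    right-to-left recurrence whose hidden states are combined at each position."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    U = [rng.uniform(-0.5, 0.5) for _ in range(dim)]       # input weights
    V = [rng.uniform(-0.5, 0.5) for _ in range(2 * dim)]   # output weights

    def step(h, x):
        return [math.tanh(U[i] * x + sum(W[i][j] * h[j] for j in range(dim)))
                for i in range(dim)]

    fwd, h = [], [0.0] * dim
    for x in seq:                          # left-to-right pass
        h = step(h, x)
        fwd.append(h)
    bwd, h = [None] * len(seq), [0.0] * dim
    for t in range(len(seq) - 1, -1, -1):  # right-to-left pass
        h = step(h, seq[t])
        bwd[t] = h
    # combine both contexts at every position
    return [sum(v * z for v, z in zip(V, f + b)) for f, b in zip(fwd, bwd)]

out = brnn([0.1, 0.9, 0.2, 0.7])
print(len(out))  # one output per residue: 4
```

Because of the backward pass, changing a late element of the sequence changes the output at earlier positions, which is exactly what lets such models exploit downstream sequence context when predicting secondary structure.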