Components show

Name:
ApproximativeJoin

Description:
Receives data through two input ports. For each key value found in the driver data records (port 0), it searches for the slave data records (port 1) with the same key value. For every pair of driver and slave records, conformity based on the join key is computed using the Levenshtein distance. Pairs with conformity higher than the specified value are joined, and the resulting records are sent to the first output port. Pairs with lower conformity are also joined, and the resulting records are sent to the second output port. Driver records without a matching slave are sent to the third output port, if connected. Slave records without a matching driver are sent to the fourth output port, if connected.
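The routing described above can be sketched in a few lines of Python. This is only an illustration and not the component's actual implementation: the function names, the tuple layout (matching key, join key), and the conformity formula (1 minus the Levenshtein distance divided by the longer string's length) are all assumptions, since the description does not say how the distance is turned into a conformity value. Pairs whose conformity exactly equals the limit are sent to the second list here, which is also an assumption.

```python
from collections import defaultdict


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def conformity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def approximative_join(drivers, slaves, limit):
    """Hypothetical helper. Records are (matching_key, join_key) tuples.

    Returns four lists that mirror the four output ports:
    conforming pairs, suspicious pairs, drivers without slave,
    slaves without driver.
    """
    slaves_by_key = defaultdict(list)
    for slave in slaves:
        slaves_by_key[slave[0]].append(slave)

    conforming, suspicious, drivers_only = [], [], []
    driver_keys = set()
    for driver in drivers:
        driver_keys.add(driver[0])
        candidates = slaves_by_key.get(driver[0], [])
        if not candidates:
            drivers_only.append(driver)
            continue
        for slave in candidates:
            score = conformity(driver[1], slave[1])
            target = conforming if score > limit else suspicious
            target.append((driver, slave, score))

    slaves_only = [s for s in slaves if s[0] not in driver_keys]
    return conforming, suspicious, drivers_only, slaves_only
```

Calling `approximative_join(drivers, slaves, 0.7)` with two small lists of such tuples returns the four groups in port order, so each list can be inspected or written out separately.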

www.arabidopsis.org

Flowering plant model genome database

The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR includes the complete genome sequence along with gene structure, gene product information, metabolism, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every two weeks from the latest published research literature and community data submissions. Gene structures are updated 1-2 times per year using computational and manual methods as well as community submissions of new and updated genes. TAIR also provides extensive linkouts from our data pages to other Arabidopsis resources. The Arabidopsis Biological Resource Center at The Ohio State University collects, reproduces, preserves and distributes seed and DNA resources of Arabidopsis thaliana and related species. Stock information and ordering for the ABRC are fully integrated into TAIR.

Project status: in production
Purpose: non-commercial
Economy sector: private
Kick-off date: 1999-01
Go-live date: 1999-11
Length in months: 27
Company:
Carnegie Institution of Washington
United States
http://www.arabidopsis.org
bmuller@stanford.edu

Number of project team members:
analysts: 15
ETL developers: 2
others: 0

Clover Usage

CloverETL framework usage: Standalone (runGraph utility and XML graph definition); Embedded (framework's classes in my application)
Main reason for Clover usage: Data transfer between operational systems and data warehouse
CloverEngine version: 2.1
CloverGUI version: 1.6
Number of different data sources: 5
Transformed data volume (MB): 70000.00
Clover usage experience (years): 0.50
Number of transformations (mappings): 30

Technology

Operating system: Linux, Windows
System version: RHEL, Solaris 10, Windows XP
Number of CPUs in system: 2
Other used systems:
System type: 32
Java Virtual Machine: Sun JVM 1.5
Database: Sybase, MySQL
