1. (WO2019002536) NOVEL MIRNA BIOMARKERS AND USE THEREOF
Note: Text based on automatic optical character recognition processes. Only the PDF version has legal value.

NOVEL MIRNA BIOMARKERS AND USE THEREOF

The present invention relates to novel nucleic acid molecules (novel miRNAs) as well as vectors, host cells, primers, cDNA-transcripts, and polynucleotides derived from said nucleic acid molecules and their use in diagnosis, prognosis, and/or therapy. In addition, the present invention relates to methods, means, and kits for diagnosing and/or prognosing a disease by detecting said novel nucleic acid molecules (novel miRNA molecules).

BACKGROUND OF THE INVENTION

MicroRNAs (miRNAs) are a new class of biomarkers. They represent a group of small noncoding RNAs that regulate gene expression at the post-transcriptional level by degrading or blocking translation of messenger RNA (mRNA) targets. MiRNAs are important players in the regulation of cellular functions and in several diseases, including cancer and neurodegenerative diseases.

So far, miRNAs have been extensively studied in tissue material. It has been found that miRNAs are expressed in a highly tissue-specific manner. Disease-specific expression of miRNAs has been reported in many human cancers, employing primarily tissue material as the miRNA source. More recently, it has become known that miRNAs are present not only in tissues but also in body fluid samples, including human blood.

In order to improve the biomarker capabilities in diagnosis and/or prognosis, there is a constant need for disease specific, well-performing biomarkers such as miRNA biomarkers. The inventors of the present invention isolated and characterized novel miRNAs from biological samples. Said novel miRNAs are suitable for use in the diagnosis and/or prognosis of diseases or conditions which are lung, kidney, brain, testis, and/or heart related, or diseases or conditions which are associated with an impaired immune system, cancer or the cardiovascular system.

SUMMARY OF THE INVENTION

In a first aspect, the present invention relates to a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

In a second aspect, the present invention relates to a nucleic acid molecule that is a complement to the nucleic acid molecule according to the first aspect.

In a third aspect, the present invention relates to a vector comprising the nucleic acid molecule according to the first or second aspect.

In a fourth aspect, the present invention relates to a host cell comprising the nucleic acid molecule according to the first or second aspect.

In a fifth aspect, the present invention relates to a host cell comprising the vector according to the third aspect.

In a sixth aspect, the present invention relates to a primer for reverse transcribing the nucleic acid molecule according to the first aspect.

In a seventh aspect, the present invention relates to a cDNA-transcript of the nucleic acid molecule according to the first aspect.

In an eighth aspect, the present invention relates to a set of primer pairs for amplifying the cDNA-transcript according to the seventh aspect.

In a ninth aspect, the present invention relates to a polynucleotide for detecting the nucleic acid molecule of the first aspect.

In a tenth aspect, the present invention relates to a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, a primer according to the sixth aspect, a cDNA-transcript according to the seventh aspect, a set of primer pairs according to the eighth aspect, and/or a polynucleotide according to the ninth aspect for use in diagnosing and/or prognosing a disease.

Alternatively, the present invention relates to the (in vitro) use of a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, a primer according to the sixth aspect, a cDNA-transcript according to the seventh aspect, a set of primer pairs according to the eighth aspect, and/or a polynucleotide according to the ninth aspect for diagnosing and/or prognosing a disease.

In an eleventh aspect, the present invention relates to a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, and/or a polynucleotide according to the ninth aspect for use as a medicament.

Alternatively, the present invention relates to the (in vitro) use of a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, and/or a polynucleotide according to the ninth aspect for therapeutic intervention (therapy).

In a twelfth aspect, the present invention relates to a method for diagnosing and/or prognosing a disease in a patient comprising the steps of:

(i) determining the level of at least one nucleic acid molecule in a biological sample isolated from the patient, and

(ii) comparing the level to a reference level, wherein the comparison of said level to said reference level allows for the diagnosis and/or prognosis of the disease,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

In a thirteenth aspect, the present invention relates to means for determining the level of at least one nucleic acid molecule comprising:

(a) at least one polynucleotide according to the ninth aspect,

(b) at least one primer according to the sixth aspect, and/or

(c) at least one set of primer pairs according to the eighth aspect,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

In a fourteenth aspect, the present invention relates to means for determining the level of at least one nucleic acid molecule comprising:

a microarray, an RT-PCR system, a PCR system, a flow cytometer, a Luminex system and/or a next generation sequencing system,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

In a fifteenth aspect, the present invention relates to a kit for diagnosing and/or prognosing a disease comprising:

(a) means for determining the level of at least one nucleic acid molecule according to the thirteenth or fourteenth aspect, and

(b) at least one reference level,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

In a sixteenth aspect, the present invention relates to a matrix comprising at least one polynucleotide which comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 and SEQ ID NO: 1521 to SEQ ID NO: 2280.

This summary of the invention does not necessarily describe all features of the invention. Other embodiments will become apparent from a review of the ensuing detailed description.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

Before the present invention is described in detail below, it is to be understood that this invention is not limited to the particular methodology, protocols and reagents described herein as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art.

In the following, the elements of the present invention will be described. These elements are listed with specific embodiments, however, it should be understood that they may be combined in any manner and in any number to create additional embodiments. The variously described examples and preferred embodiments should not be construed to limit the present invention to only the explicitly described embodiments. This description should be understood to support and encompass embodiments which combine the explicitly described embodiments with any number of the disclosed and/or preferred elements. Furthermore, any permutations and combinations of all described elements in this application should be considered disclosed by the description of the present application unless the context indicates otherwise.

Preferably, the terms used herein are defined as described in "A multilingual glossary of biotechnological terms: (IUPAC Recommendations)", H.G.W. Leuenberger, B. Nagel, and H. Kölbl, Eds., Helvetica Chimica Acta, CH-4010 Basel, Switzerland, (1995).

To practice the present invention, unless otherwise indicated, conventional methods of chemistry, biochemistry, and recombinant DNA techniques are employed which are explained in the literature in the field (cf., e.g., Molecular Cloning: A Laboratory Manual, 2nd Edition, J. Sambrook et al. eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor 1989).

Several documents are cited throughout the text of this specification. Each of the documents cited herein (including all patents, patent applications, scientific publications, manufacturer's specifications, instructions, etc.), whether supra or infra, is hereby incorporated by reference in its entirety. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

As used in this specification and in the appended claims, the singular forms "a", "an", and "the" include plural referents, unless the content clearly dictates otherwise.

The terms "microRNA" or "miRNA" refer to single-stranded RNA molecules of at least 10 nucleotides and of not more than 35 nucleotides covalently linked together. Preferably, the polynucleotides of the present invention are molecules of 10 to 33 nucleotides or 15 to 30 nucleotides in length, more preferably of 17 to 27 nucleotides or 18 to 26 nucleotides in length, i.e. 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides in length, not including optional labels and/or elongated sequences (e.g. biotin stretches). The miRNAs regulate gene expression and are encoded by genes from whose DNA they are transcribed, but miRNAs are not translated into protein (i.e. miRNAs are non-coding RNAs). The genes encoding miRNAs are longer than the processed mature miRNA molecules. The miRNAs are first transcribed as primary transcripts or pri-miRNAs with a cap and poly-A tail and processed to short, 70-nucleotide stem-loop structures known as pre-miRNAs in the cell nucleus. This processing is performed in animals by a protein complex known as the Microprocessor complex, consisting of the nuclease Drosha and the double-stranded RNA binding protein Pasha. These pre-miRNAs are then processed to mature miRNAs in the cytoplasm by interaction with the endonuclease Dicer, which also initiates the formation of the RNA-induced silencing complex (RISC). When Dicer cleaves the pre-miRNA stem-loop, two complementary short RNA molecules are formed, but only one is integrated into the RISC. This strand is known as the guide strand and is selected by the argonaute protein, the catalytically active RNase in the RISC, on the basis of the stability of the 5' end. The remaining strand, known as the miRNA*, anti-guide (anti-strand), or passenger strand, is degraded as a RISC substrate. Therefore, the miRNA*s are derived from the same hairpin structure as the "normal" miRNAs. So if the "normal" miRNA is then later called the "mature miRNA" or "guide strand", the miRNA* is the "anti-guide strand" or "passenger strand".
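As an illustration of the length range given in this definition, the following minimal sketch checks whether a candidate sequence is an unmodified RNA string of 10 to 35 nucleotides. The function name `is_mirna_length` and the restriction to the four standard RNA letters are assumptions made for the example; labels and elongated sequences, which the definition excludes from the count, are not modelled.

```python
import re

# Length bounds taken from the definition above (10 to 35 nucleotides).
MIN_LEN, MAX_LEN = 10, 35

# Assumption for this sketch: only the four standard RNA letters are accepted.
_RNA_RE = re.compile(r"[ACGU]+")


def is_mirna_length(seq: str) -> bool:
    """Return True if seq is an RNA string of 10 to 35 nucleotides."""
    seq = seq.upper()
    return bool(_RNA_RE.fullmatch(seq)) and MIN_LEN <= len(seq) <= MAX_LEN


# Arbitrary 22-nucleotide illustration, not one of the SEQ ID NOs.
print(is_mirna_length("UGAGGUAGUAGGUUGUAUAGUU"))
```

A DNA-style string containing T, or a string outside the length range, is rejected by this check.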

The terms "microRNA*" or "miRNA*" refer to single-stranded RNA molecules of at least 10 nucleotides and of not more than 35 nucleotides covalently linked together. Preferably, the polynucleotides of the present invention are molecules of 10 to 33 nucleotides or 15 to 30 nucleotides in length, more preferably of 17 to 27 nucleotides or 18 to 26 nucleotides in length, i.e. 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides in length, not including optional labels and/or elongated sequences (e.g. biotin stretches). The "miRNA*s", also known as the "anti-guide strands" or "passenger strands", are mostly complementary to the "mature miRNAs" or "guide strands", but usually have single-stranded overhangs on each end. There are usually one or more mispairs and there are sometimes extra or missing bases causing single-stranded "bubbles". The miRNA*s are likely to act in a regulatory fashion, as the miRNAs do (see also above). In the context of the present invention, the terms "miRNA" and "miRNA*" are used interchangeably. The present invention encompasses (target) miRNAs which are dysregulated in biological samples such as blood of a diseased subject, preferably an AD and/or an MS subject, in comparison to healthy controls. Said (target) miRNAs are preferably selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

The term "miRBase" refers to a well established repository of validated miRNAs. The miRBase (www.mirbase.org) is a searchable database of published miRNA sequences and annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching and browsing, and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download.

As used herein, the term "nucleotides" refers to structural components, or building blocks, of DNA and RNA. Nucleotides consist of a base (one of four chemicals: adenine, thymine, guanine, and cytosine) plus a molecule of sugar and one of phosphoric acid. The term "nucleosides" refers to glycosylamines consisting of a nucleobase (often referred to simply as a base) bound to a ribose or deoxyribose sugar. Examples of nucleosides include cytidine, uridine, adenosine, guanosine, thymidine and inosine. Nucleosides can be phosphorylated by specific kinases in the cell on the sugar's primary alcohol group (-CH2-OH), producing nucleotides, which are the molecular building blocks of DNA and RNA.

The term "polynucleotide", as used herein, means a molecule of at least 10 nucleotides and of not more than 80 nucleotides covalently linked together. Preferably, the polynucleotides of the present invention are molecules of 10 to 70 nucleotides or 15 to 68 nucleotides in length, i.e. 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67 or 68 nucleotides in length, not including optional spacer elements and/or elongation elements described below. The depiction of a single strand of a polynucleotide also defines the sequence of the complementary strand. Polynucleotides may be single stranded or double stranded, or may contain portions of both double stranded and single stranded sequences. The term "polynucleotide" means a polymer of deoxyribonucleotide or ribonucleotide bases and includes DNA and RNA molecules, both sense and anti-sense strands. In detail, the polynucleotide may be DNA, both cDNA and genomic DNA, RNA, cRNA or a hybrid, where the polynucleotide sequence may contain combinations of deoxyribonucleotide or ribonucleotide bases, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Polynucleotides may be obtained by chemical synthesis methods or by recombinant methods.

In the context of the present invention, a polynucleotide as a single polynucleotide strand provides a probe (e.g. miRNA capture probe) that is capable of binding to, hybridizing with, or detecting a target of complementary sequence, such as a nucleotide sequence of a miRNA or miRNA*, through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Polynucleotides in their function as probes may bind target sequences, such as nucleotide sequences of miRNAs or miRNA*s, lacking complete complementarity with the polynucleotide sequences, depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence, such as a nucleotide sequence of a miRNA or miRNA*, and the single stranded polynucleotide described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent hybridization conditions, the sequences are not complementary sequences. The present invention encompasses polynucleotides in the form of single polynucleotide strands as probes for binding to, hybridizing with or detecting complementary sequences of (target) miRNAs, that may be used in diagnosing and/or prognosing a disease, preferably MS or AD. Said (target) miRNAs are preferably selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

The term "nucleic acid molecule", as used herein, refers to a DNA or RNA molecule, preferably a miRNA molecule. The nucleic acid molecule of the present invention comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.
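The aspects summarized earlier also cover nucleotide sequences having at least 95% sequence identity to SEQ ID NO: 1 to SEQ ID NO: 760. The specification does not prescribe how identity is computed, so the following is only a minimal sketch under a simplifying assumption: both sequences have the same length and are compared position by position, without gaps. The function names and the default threshold parameter are hypothetical; a real comparison would normally rely on a pairwise alignment tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-wise percent identity of two equal-length sequences.

    Simplifying assumption for this sketch: no alignment is performed,
    so sequences of different length are rejected.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity_threshold(a: str, b: str, threshold: float = 95.0) -> bool:
    """Return True if the ungapped identity is at least `threshold` percent."""
    return percent_identity(a, b) >= threshold


# Arbitrary 20-nucleotide strings differing at one position (19/20 identical).
reference = "ACGUACGUACGUACGUACGU"
variant = "ACGUACGUACGUACGUACGA"
print(percent_identity(reference, variant))
print(meets_identity_threshold(reference, variant))
```

Under this simplification, a 20-nucleotide sequence tolerates one substitution at the 95% threshold, while two substitutions fall below it.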

The term "complement of a nucleic acid molecule", as used herein, refers to sequences that are complementary to the nucleotide sequence of the nucleic acid molecule described herein. The nucleic acid molecule of the present invention comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. In the context of the present invention, the terms "complement of a nucleic acid molecule" and "reverse complement of a nucleic acid molecule" are used interchangeably. Said terms include both complementary (and reverse complementary) DNA- and RNA-sequences.
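To make the notion of a (reverse) complement concrete, the following short sketch derives the reverse complement of an RNA sequence either as RNA or as DNA, using standard Watson-Crick pairing. The function name `reverse_complement` and the `as_dna` flag are hypothetical, and the input is assumed to contain only the letters A, C, G and U.

```python
# Watson-Crick pairing for an RNA input; output as RNA (A-U) or as DNA (A-T).
_TO_RNA = str.maketrans("ACGU", "UGCA")
_TO_DNA = str.maketrans("ACGU", "TGCA")


def reverse_complement(rna: str, as_dna: bool = False) -> str:
    """Return the reverse complement of an RNA sequence.

    With as_dna=True the complement is written with T instead of U,
    as would be the case for a DNA probe or a cDNA strand.
    """
    table = _TO_DNA if as_dna else _TO_RNA
    # Complement each base, then reverse so the result reads 5' to 3'.
    return rna.upper().translate(table)[::-1]


print(reverse_complement("AACG"))               # CGUU
print(reverse_complement("AACG", as_dna=True))  # CGTT
```

Applying the RNA variant twice returns the original sequence, which is a quick sanity check for the mapping.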

The term "biological sample", as used herein, refers to any biological sample from a patient or (control) subject comprising the nucleic acid molecule according to the present invention. The biological sample may be a tissue or a body fluid sample. For example, biological samples encompassed by the present invention are tissue samples (tissues including, but not limited to, those of the lung, heart, brain, breast, skin, kidney, colon, prostate, pancreas, stomach, eye), blood (e.g. whole blood or blood fraction such as blood cell fraction, serum or plasma) samples, urine samples, or samples from other peripheral sources. Said biological samples may be mixed or pooled, e.g. a sample may be a mixture of a blood sample and a urine sample. Said biological samples may be provided by removing a tissue or a body fluid from a patient or (control) subject, but may also be provided by using a previously isolated sample. For example, a blood sample may be taken from a patient or (control) subject by conventional blood collection techniques. The biological sample, e.g. tissue sample, urine sample or blood sample, may be obtained from a patient or (control) subject prior to the initiation of a therapeutic treatment, during the therapeutic treatment, and/or after the therapeutic treatment. If the biological sample is obtained from at least one (control) subject, e.g. from at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500, or 1,000 (control) subject(s), it is designated as a "reference biological sample". Preferably, the reference biological sample is from the same source as the biological sample of the patient to be tested, e.g. both are blood samples or urine samples. It is further preferred that both are from the same species, e.g. from a human. It is also (alternatively or additionally) preferred that the measurements of the reference biological sample and the biological sample of the patient to be tested are identical, e.g. both have an identical volume. It is particularly preferred that the reference biological sample and the biological sample are from patients/(control) subjects of the same sex and similar age, e.g. no more than 2 years apart from each other.

The term "body fluid sample", as used herein, refers to any liquid sample derived from the body of a patient or (control) subject comprising the nucleic acid molecule according to the present invention.

Said body fluid sample may be a urine sample, blood sample, sputum sample, breast milk sample, cerebrospinal fluid (CSF) sample, cerumen (earwax) sample, gastric juice sample, mucus sample, endolymph fluid sample, perilymph fluid sample, peritoneal fluid sample, pleural fluid sample, saliva sample, sebum (skin oil) sample, semen sample, sweat sample, tears sample, cheek swab, vaginal secretion sample, liquid biopsy, or vomit sample including components or fractions thereof. The term "body fluid sample" also encompasses body fluid fractions, e.g. blood fractions, urine fractions or sputum fractions. The body fluid samples may be mixed or pooled. Thus, a body fluid sample may be a mixture of a blood and a urine sample or a mixture of a blood and cerebrospinal fluid sample. Said body fluid sample may be provided by removing a body liquid from a patient or (control) subject, but may also be provided by using previously isolated body fluid sample material. The body fluid sample allows for a non-invasive analysis of a patient. It is further preferred that the body fluid sample has a volume of between 0.01 and 20 ml, more preferably of between 0.1 and 10 ml, even more preferably of between 0.5 and 8 ml, and most preferably of between 1 and 5 ml. If the body fluid sample is obtained from at least one (control) subject, e.g. from at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500, or 1,000 (control) subject(s), it is designated as a "reference body fluid sample".

The term "blood sample", as used in the context of the present invention, refers to a blood sample originating from a subject. The "blood sample" may be derived by removing blood from a subject by conventional blood collecting techniques, but may also be provided by using previously isolated and/or stored blood samples. For example, a blood sample may be whole blood, plasma, serum, PBMC (peripheral blood mononuclear cells), blood cellular fractions including red blood cells (erythrocytes), white blood cells (leukocytes), platelets (thrombocytes), or blood collected in blood collection tubes (e.g. EDTA-, heparin-, citrate-, PAXgene-, Tempus-tubes) including components or fractions thereof. For example, a blood sample may be taken from a subject suspected to be affected by a disease prior to initiation of a therapeutic treatment, during the therapeutic treatment and/or after the therapeutic treatment. Preferably, the blood sample from a subject (e.g. human or animal) has a volume of between 0.1 and 20 ml, more preferably of between 0.5 and 10 ml, more preferably between 1 and 8 ml and most preferably between 2 and 5 ml, i.e. 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 ml.

Preferably, when the blood sample is collected from the subject, the RNA fraction, especially the miRNA fraction, is guarded against degradation. For this purpose, special collection tubes that already include additives (e.g. PAXgene RNA tubes from Preanalytix, Tempus Blood RNA tubes from Applied Biosystems), or additives that are added separately to the blood sample (e.g. RNAlater from Ambion, RNasin from Promega), are employed to stabilize the RNA fraction and/or the miRNA fraction.

The term "biomarker", as used in the context of the present invention, represents a characteristic that can be objectively measured and evaluated as an indicator of normal and disease processes or pharmacological responses. A biomarker is a parameter that can be used to measure the onset or the progress of disease or the effects of treatment. The parameter can be chemical, physical or biological.

The term "diagnosis", as used in the context of the present invention, refers to a process of determining a disease or disorder in a patient. It is, therefore, a process attempting to define the (clinical) condition of a patient. The level of the nucleic acid molecule according to the present invention determined with the method according to the present invention correlates with the (clinical) condition of a patient. Preferably, the diagnosis comprises (i) determining the occurrence/presence of a disease, (ii) monitoring the course of a disease, (iii) staging of a disease, (iv) measuring the response of a patient with a disease to therapeutic intervention, and/or (v) segmentation of a patient suffering from a disease.

The term "prognosis", as used in the context of the present invention, refers to a process of describing the likelihood of the outcome or course of a disease or a disorder. Preferably, the prognosis comprises (i) identifying a subject/patient who has a risk to develop a disease, (ii) predicting/estimating the occurrence, preferably the severity of occurrence of a disease, and/or (iii) predicting the response of a patient with a disease to therapeutic intervention.

The disease or disorder may be a lung, kidney, brain, testis, and/or heart related disease or disorder, or a disease or disorder associated with an impaired immune system, cancer or the cardiovascular system.

The term "treatment", in particular "therapeutic treatment", as used herein, refers to any therapy which improves the health status and/or prolongs (increases) the lifespan of a patient. Said therapy may eliminate the disease in a patient, arrest or slow the development of a disease in a patient, inhibit or slow the development of a disease in a patient, decrease the frequency or severity of symptoms in a patient, and/or decrease the recurrence in a patient who currently has or who previously has had a disease. The treatment of a disease or condition may include, but is not limited to, surgery, chemotherapy, radiotherapy, administration of a drug, exercise training, mental training, and/or physical rehabilitation.

The term "patient", as used herein, refers to any subject for whom it is desired to know whether she or he suffers from a disease or condition. In particular, the term "patient", as used herein, refers to a subject suspected to be affected by a disease or condition. The patient may be diagnosed to be affected by the disease or condition or may be diagnosed to be not affected by the disease or condition, i.e. healthy. The term "patient", as used herein, also refers to a subject which is affected by a disease or condition. The patient may be retested for the disease or condition and may be diagnosed to be still affected by the disease or condition or not affected by the disease or condition anymore, i.e. healthy, for example after therapeutic intervention. The patient may also be retested for the disease or condition and may be diagnosed as having developed an advanced form of the disease or condition.

It should be noted that a patient that is diagnosed as being healthy, i.e. not suffering from the disease or condition, may possibly suffer from another disease not tested/known.

The patient may be any mammal, including both a human and another mammal, e.g. an animal such as a rabbit, mouse, rat, or monkey. Human patients are particularly preferred.

The term "(control) subject", as used herein, refers to a subject known to be affected by a disease or condition. Said (control) subject may also have developed an advanced form of a disease or condition. The term "(control) subject", as used herein, also refers to a subject known to be not affected by a disease or condition (negative control), i.e. healthy. Thus, the term "healthy subject", as used herein, means a subject which is known to be not affected by a disease or condition. However, the (control) subject may suffer from another disease or condition not tested/known. The (control) subject may be any mammal, including both a human and another mammal, e.g. an animal such as a rabbit, mouse, rat, or monkey. Human (control) subjects are particularly preferred.

The term "nucleic acid molecule level," as used herein, refers to the level of the nucleic acid molecule in a biological sample, and thus represents a quantitative measure of said nucleic acid molecule in said sample. The nucleic acid molecule level may be generated by any convenient means, e.g. nucleic acid hybridization (e.g. to a microarray, bead-based methods), nucleic acid amplification (PCR, RT-PCR, qRT-PCR, high-throughput RT-PCR), ELISA for quantitation, next generation sequencing (e.g. ABI SOLID, Illumina Genome Analyzer, Roche/454 GS FLX), flow cytometry (e.g. LUMINEX) and the like, that allow the analysis of differential nucleic acid molecule levels between samples of a subject (e.g. diseased) and a control subject (e.g. healthy, reference sample). The sample material measured by the aforementioned means may be total RNA, labeled total RNA, amplified total RNA, cDNA, labeled cDNA, amplified cDNA, miRNA, labeled miRNA, amplified miRNA or any derivatives that may be generated from the aforementioned RNA/DNA species. By determining the nucleic acid molecule level, each nucleic acid molecule is represented by a numerical value. The higher the value of an individual nucleic acid molecule, the higher is the level of said nucleic acid molecule, or the lower the value of an individual nucleic acid molecule, the lower is the level of said nucleic acid molecule.

The term "level", as used herein, refers to an amount (measured for example in grams, mole, or (fluorescence) counts) or concentration (e.g. absolute or relative concentration) of at least one nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. The term "level", as used herein, also comprises scaled, normalized, or scaled and normalized amounts or values. The level may also be a cut-off level. Preferably, the level is the expression level.

The term "differential expression" of a nucleic acid molecule, as used herein, refers to a qualitative and/or quantitative difference in the temporal and/or local nucleic acid molecule expression pattern, e.g. within and/or among biological samples, body fluid samples, cells, or within blood. Thus, a differentially expressed nucleic acid molecule may qualitatively have its expression altered, including an activation or inactivation in, for example, blood from a diseased subject versus blood from a healthy subject. The difference in nucleic acid molecule expression may also be quantitative, e.g. in that expression is modulated, i.e. either up-regulated, resulting in an increased amount of the nucleic acid molecule, or down-regulated, resulting in a decreased amount of the nucleic acid molecule. The degree to which nucleic acid molecule expression differs need only be large enough to be quantified via standard expression characterization techniques, e.g. by quantitative hybridization (e.g. to a microarray, to beads), amplification (PCR, RT-PCR, qRT-PCR, high-throughput RT-PCR), ELISA for quantitation, next generation sequencing (e.g. ABI SOLID, Illumina Genome Analyzer, Roche/454 GS FLX), flow cytometry (e.g. LUMINEX) and the like.
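As an illustration only, the quantitative sense of differential expression described above can be sketched as a log2 fold change between levels measured in diseased and healthy samples. The function names, the pseudocount, the cut-off of one log2 unit, and the example values are assumptions made for this sketch and are not taken from the present description.

```python
import math
from statistics import mean


def log2_fold_change(case_levels, control_levels, pseudocount=1.0):
    """Return log2 of (mean case level / mean control level).

    The pseudocount is a hypothetical choice that avoids division by zero
    when a nucleic acid molecule is not detected in one group.
    """
    case_mean = mean(case_levels) + pseudocount
    control_mean = mean(control_levels) + pseudocount
    return math.log2(case_mean / control_mean)


def regulation(log2_fc, min_abs_log2_fc=1.0):
    """Label a log2 fold change; the cut-off is an illustrative assumption."""
    if log2_fc >= min_abs_log2_fc:
        return "up-regulated"
    if log2_fc <= -min_abs_log2_fc:
        return "down-regulated"
    return "unchanged"


# Hypothetical levels (e.g. fluorescence counts) for one nucleic acid molecule.
diseased = [410.0, 395.0, 430.0]
healthy = [98.0, 105.0, 102.0]
fc = log2_fold_change(diseased, healthy)
print(regulation(fc))
```

Any of the measurement techniques listed above could supply the input values; the sketch only shows how two groups of numerical levels may be compared.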

Nucleic acid hybridization may be performed using a microarray/biochip or in situ hybridization. In situ hybridization is preferred for the analysis of a single nucleic acid molecule or a set comprising a low number of nucleic acid molecules (e.g. a set of at least 2 to 50 nucleic acid molecules such as a set of 2, 5, 10, 20, 30, or 40 nucleic acid molecules). The microarray/biochip, however, allows the analysis of a single nucleic acid molecule as well as a complex set of nucleic acid molecules (e.g. all known nucleic acid molecules or subsets thereof). For nucleic acid hybridization, for example, the polynucleotides (probes, preferably DNA-probes) according to the present invention with complementarity to the corresponding nucleic acid molecules to be detected are attached to a matrix to generate a microarray/biochip (e.g. 760 polynucleotides (probes) which are complementary to the 760 nucleic acid molecules comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760). Said microarray/biochip is then incubated with a biological sample containing nucleic acid molecules, isolated (e.g. extracted) from the biological sample from a subject such as a human or an animal, which may be labelled, e.g. fluorescently labelled, or unlabelled. Quantification of the expression level of the nucleic acid molecules may then be carried out e.g. by direct read out of a label or by additional manipulations, e.g. by use of a polymerase reaction (e.g. template directed primer extension, MPEA-Assay, RAKE-assay) or a ligation reaction to incorporate or add labels to the captured nucleic acid molecules.

Alternatively, the polynucleotides which are at least partially complementary (e.g. a set of chimeric polynucleotides with each a first stretch being complementary to a set of nucleic acid molecule sequences and a second stretch complementary to capture probes bound to a matrix surface (e.g. beads, Luminex beads)) to the nucleic acid molecules comprising nucleotide sequences selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 are contacted with the biological sample containing nucleic acid molecules (e.g. a body fluid sample, preferably a blood sample) in solution to hybridize. Afterwards, the hybridized duplexes are pulled down to the surface (e.g. a plurality of beads) and successfully captured nucleic acid molecules are quantitatively determined (e.g. FlexmiR-assay, FlexmiR v2 detection assays from Luminex).

Nucleic acid amplification may be performed using real time polymerase chain reaction (RT-PCR) such as real time quantitative polymerase chain reaction (RT qPCR). The standard real time polymerase chain reaction (RT-PCR) is preferred for the analysis of a single nucleic acid molecule or a set comprising a low number of nucleic acid molecules (e.g. a set of at least 2 to 50 nucleic acid molecules such as a set of 2, 5, 10, 20, 30, or 40 nucleic acid molecules), whereas high-throughput RT-PCR technologies (e.g. OpenArray from Applied Biosystems, SmartPCR from Wafergen, Biomark System from Fluidigm) are also able to measure large sets (e.g. a set of 10, 20, 30, 50, 80, 100, 200 or more) to all known nucleic acid molecules in a highly parallel fashion. RT-PCR is particularly suitable for detecting low-abundance nucleic acid molecules.

The aforesaid real time polymerase chain reaction (RT-PCR) may include the following steps: (i) extracting total RNA from a biological sample obtained from a patient, (ii) obtaining cDNA-transcripts by RNA reverse transcription (RT) reaction using universal or nucleic acid molecule-specific RT primers (e.g. stem-loop RT primers), (iii) optionally amplifying the obtained cDNA-transcripts (e.g. by PCR such as a specific target amplification (STA)), and (iv) detecting the nucleic acid molecule level in the sample by means of (real time) quantification of the cDNA of step (ii) or (iii), e.g. by real time polymerase chain reaction wherein a fluorescent dye (e.g. SYBR Green) or a fluorescent probe (e.g. Taqman probe) is added. In step (i) the isolation and/or extraction of RNA may be omitted in cases where the RT-PCR is conducted directly from the nucleic acid molecule-containing sample. Kits for determining a nucleic acid molecule level by real time polymerase chain reaction (RT-PCR) are available e.g. from Life Technologies, Applied Biosystems, Ambion, Roche, Qiagen, Invitrogen, SABiosciences, Exiqon.

A variety of kits and protocols to determine nucleic acid molecule level by real time polymerase chain reaction (RT-PCR) such as real time quantitative polymerase chain reaction (RT qPCR) are available. For example, reverse transcription of nucleic acid molecules may be performed using the TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems) according to manufacturer's recommendations. Briefly, nucleic acid molecules may be combined with dNTPs, MultiScribe reverse transcriptase and the primer specific for the target nucleic acid molecules. The resulting cDNA may be diluted and may be used for PCR reaction. The PCR may be performed according to the manufacturer's recommendation (Applied Biosystems). Briefly, cDNA may be combined with the TaqMan assay specific for the target nucleic acid molecules and PCR reaction may be performed using ABI7300. Alternative kits are available from Ambion, Roche, Qiagen, Invitrogen, SABiosciences, Exiqon etc.
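The present description does not prescribe how real time PCR readouts are converted into a level. One widely used option, shown here purely as a sketch, is the comparative threshold cycle (2^-ΔΔCt) calculation. It assumes that a threshold cycle (Ct) is available for the target nucleic acid molecule and for an endogenous reference RNA, and that amplification efficiency is close to 100%. The function name, the parameter names and the Ct values are hypothetical.

```python
def relative_level(ct_target_sample, ct_ref_sample,
                   ct_target_control, ct_ref_control):
    """Relative level of a target by the 2^-ddCt calculation.

    Each delta is the target Ct minus the Ct of an endogenous reference RNA
    measured in the same sample; doubling per cycle is assumed.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses the threshold two cycles
# earlier in the patient sample than in the control sample.
print(relative_level(24.0, 18.0, 26.0, 18.0))
```

A value above 1 indicates a higher level in the sample than in the control, a value below 1 a lower level.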

The nucleic acid molecule of the present invention is preferably an isolated nucleic acid molecule. The term "isolated nucleic acid molecule", as used herein, refers to a nucleic acid molecule that is (i) isolated from its natural environment, e.g. tissue or body fluid such as blood or urine, e.g. extracted, (ii) amplified by polymerase chain reaction, and/or (iii) wholly or partially synthesized.

As mentioned above, the nucleic acid molecule of the present invention is a miRNA molecule.

The term "vector", as used herein, refers to a molecule which is used as a vehicle to artificially carry the nucleic acid molecule of the present invention into a cell. The vector may be a plasmid vector or a viral vector. The vector may also be a DNA-vector or RNA-vector.

The term "host cell", as used herein, refers to a host cell that comprises the nucleic acid molecule or the vector according to the present invention. The nucleic acid molecule may be found inside the host cell freely dispersed as such or incorporated into a vector. The term "host cell" includes the progeny of the original cell which has been transformed, transfected, or infected with the nucleic acid molecule or the vector of the present invention. A host cell may be a bacterial cell such as an E. coli cell or a mammalian cell. The host cell which comprises the nucleic acid molecule or the vector of the present invention is also designated as recombinant host cell.

The terms "biochip" or "microarray", as used herein, refer to a matrix, e.g. solid support/phase, comprising an attached or immobilized polynucleotide described herein as probe or polynucleotides described herein attached or immobilized as probes. The polynucleotide probes are capable of hybridizing to a target nucleic acid molecule, such as a complementary miRNA or miRNA* sequence, under stringent hybridization conditions. The polynucleotide probes may be attached or immobilized at spatially defined locations on the matrix, e.g. solid phase. One or more than one polynucleotide (probe) per target nucleic acid sequence may be used. The polynucleotide probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip. The matrix, e.g. solid phase, may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the polynucleotide probes and is amenable to at least one detection method. Representative examples of matrix, e.g. solid phase, materials include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. The matrix, e.g. solid phase, may allow optical detection without appreciably fluorescing. The matrix, e.g. solid phase, may be planar, although other configurations of matrix may be used as well. For example, polynucleotide probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the matrix, e.g. solid phase, may be flexible, such as flexible foam, including closed cell foams made of particular plastics. The matrix, e.g. solid phase, of the biochip and the probe may be modified with chemical functional groups for subsequent attachment of the two. For example, the biochip may be modified with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker. The polynucleotide probes may be attached to the matrix, e.g. solid support, by either the 5' terminus, 3' terminus, or via an internal nucleotide. The polynucleotide probe may also be attached to the matrix, e.g. solid support, non-covalently. For example, biotinylated polynucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, polynucleotide probes may be synthesized on the surface using techniques such as photopolymerization and photolithography. In the context of the present invention, the terms "biochip" and "microarray" are used interchangeably.

The terms "attached" or "immobilized", as used herein, refer to the binding between the polynucleotide and the matrix, e.g. solid support/phase and may mean that the binding between the polynucleotide probe and the matrix, e.g. solid support/phase is sufficient to be stable under conditions of binding, washing, analysis and removal. The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the polynucleotide and the matrix or may be formed by a cross linker or by inclusion of specific reactive groups on either the matrix or the polynucleotide, or both. Non-covalent binding may be electrostatic, hydrophilic and hydrophobic interactions or combinations thereof. Immobilization or attachment may also involve a combination of covalent and non-covalent interactions.

In the context of the present invention, the term "kit of parts (in short: kit)" is understood to be any combination of at least some of the components identified herein, which are combined, coexisting spatially, to a functional unit, and which can contain further components. Said kit may allow point-of-care testing (POCT).

The term "point-of-care testing (POCT)", as used herein, refers to a medical diagnostic testing at or near the point of care that is the time and place of patient care. This contrasts with the historical pattern in which testing was wholly or mostly confined to the medical laboratory, which entailed sending off specimens away from the point of care and then waiting hours or days to learn the results, during which time care must continue without the desired information. Point-of-care tests are simple medical tests that can be performed at the bedside. The driving notion behind POCT is to bring the test conveniently and immediately to the patient. This increases the likelihood that the patient, physician, and care team will receive the results quicker, which allows for immediate clinical management decisions to be made. POCT is often accomplished through the use of transportable, portable, and handheld instruments and test kits. Small bench analyzers or fixed equipment can also be used when a handheld device is not available - the goal is to collect the specimen and obtain the results in a very short period of time at or near the location of the patient so that the treatment plan can be adjusted as necessary before the patient leaves.

Embodiments of the invention

The inventors of the present invention isolated and characterized novel miRNAs from biological samples. Said novel miRNAs are suitable for use in the diagnosis and/or prognosis of diseases or conditions which are lung, kidney, brain, testis, and/or heart related, or diseases or conditions which are associated with an impaired immune system, cancer or the cardiovascular system.

In a first aspect, the present invention relates to a nucleic acid molecule comprising or consisting of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95%, preferably at least 96%, 97%, 98%, or 99%, sequence identity thereto.

The nucleic acid molecule comprising or consisting of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 is an RNA molecule, preferably a miRNA molecule (see Figure 1). The inventors of the present invention have isolated said nucleic acid molecules from lung tissue, kidney tissue, brain tissue, heart tissue, the testis, and/or blood (e.g. plasma or blood cells) (see Figure 1). Thus, said nucleic acid molecules are qualified to be employed as biomarkers in the diagnosis and/or prognosis of diseases or conditions which are lung, kidney, brain, testis, and/or heart related or diseases or conditions which are associated with an impaired immune system, cancer or the cardiovascular system.

It is preferred that the nucleic acid molecule is an isolated nucleic acid molecule.

According to the first aspect,

(i) the (isolated) nucleic acid molecule comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760,

(ii) the (isolated) nucleic acid molecule is a fragment of the nucleic acid molecule according to (i), preferably the (isolated) nucleic acid molecule is a fragment which is between 1 and 12, more preferably between 1 and 8, and most preferably between 1 and 5 or 1 and 3, i.e. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12, nucleotides shorter than the nucleic acid molecule according to (i), or

(iii) the (isolated) nucleic acid molecule has at least 95% or 99%, i.e. at least 95, 96, 97, 98, or 99%, sequence identity to the nucleotide sequence of the nucleic acid molecule according to (i) or nucleic acid molecule fragment according to (ii).
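The fragment and sequence identity criteria listed above can be illustrated with a minimal sketch. It assumes a simple ungapped, position-by-position comparison of two sequences of equal length; in practice sequence identity is usually determined with an alignment tool, and this section does not specify one. The function names and the 22-nucleotide placeholder sequence are hypothetical and do not correspond to any SEQ ID NO.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped identity in percent between two equal-length sequences."""
    if len(candidate) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


def is_fragment(candidate: str, reference: str, max_shorter: int = 12) -> bool:
    """True if candidate is a contiguous part of reference that is
    between 1 and max_shorter nucleotides shorter than reference."""
    missing = len(reference) - len(candidate)
    return 1 <= missing <= max_shorter and candidate in reference


# Placeholder sequence for illustration only.
reference = "ACGUACGUACGUACGUACGUAC"
variant = "ACGUACGUACGUACGUACGUAG"   # one substitution at the 3' end

print(round(percent_identity(variant, reference), 2))
print(is_fragment(reference[:-2], reference))
```

With a 22-nucleotide sequence, a single substitution still satisfies the 95% criterion, whereas two substitutions do not.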

In a second aspect, the present invention relates to a nucleic acid molecule that is a complement/reverse complement to the nucleic acid molecule according to the first aspect.
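By way of illustration, the complement and the reverse complement of an RNA sequence can be derived as follows. This is a sketch assuming standard Watson-Crick pairing (A with U, G with C); the function names and the short example sequence are hypothetical.

```python
# Translation table for Watson-Crick pairing in RNA.
_RNA_PAIR = str.maketrans("ACGU", "UGCA")


def complement(rna: str) -> str:
    """Return the base-wise complement of an RNA sequence."""
    return rna.upper().translate(_RNA_PAIR)


def reverse_complement(rna: str) -> str:
    """Return the complement read in the opposite direction."""
    return complement(rna)[::-1]


print(complement("AUGC"))
print(reverse_complement("AUGC"))
```

Applying `reverse_complement` twice returns the original sequence, which is a convenient sanity check.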

In a third aspect, the present invention relates to a vector comprising the nucleic acid molecule according to the first or second aspect.

Preferably, the vector comprises the nucleic acid molecule according to the first aspect. Said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. It is understood that if said vector is a RNA-vector, it is the RNA-form of the nucleic acid molecule, its complement or a fragment thereof that is comprised in the vector. It is further understood that if said vector is a DNA-vector, it is the DNA-form of the nucleic acid molecule, its complement or a fragment thereof that is comprised in the vector. In a further embodiment, the vector is a pSG5 vector comprising the DNA-form of the nucleic acid molecules according to the first aspect. Said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

In a fourth aspect, the present invention relates to a host cell comprising the nucleic acid molecule according to the first or second aspect. Said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 or is a complement thereof. The host cell may be transformed, transfected, or infected with the nucleic acid molecule according to the first or second aspect.

Preferably, the host cell is a human cell that comprises the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 or is a complement thereof. More preferably, the host cell is a human 293T cell that comprises the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 or is a complement thereof.

In a fifth aspect, the present invention relates to a host cell comprising the vector according to the third aspect. Said vector comprises a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 or a complement thereof. The host cell may be transformed, transfected, or infected with the vector according to the third aspect.

Preferably, the host cell is a human cell that comprises the vector containing the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 or a complement thereof. More preferably, the host cell is a human 293T cell that comprises the vector containing the nucleic acid molecule comprising the nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 or a complement thereof. More preferably, the host cell is a human 293T cell comprising a pSG5-expression plasmid comprising the nucleic acid molecule according to the first or second aspect.

In a sixth aspect, the present invention relates to a primer for reverse transcribing the nucleic acid molecule according to the first aspect.

The primers may be universal or specific primers for reverse transcribing the nucleic acid molecule according to the first aspect. The universal primers for reverse transcribing the nucleic acid molecule according to the first aspect may comprise a poly-T sequence motif. The specific primers for reverse transcribing the nucleic acid molecule according to the first aspect are preferably partially complementary to the 3'-end of the nucleic acid molecule. It is especially preferred to employ stem-loop RT primers for reverse transcribing the nucleic acid molecule according to the first aspect. Said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

In a seventh aspect, the present invention relates to a cDNA-transcript of the nucleic acid molecule according to the first aspect.

Said cDNA-transcript is preferably obtained when using the RT-primers according to the sixth aspect. Preferably, a cDNA-transcript of the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760 is obtained when employing the RT-primers according to the sixth aspect of the invention.

In an eighth aspect, the present invention relates to a set of primer pairs for amplifying the cDNA-transcript according to the seventh aspect.

Preferably, primer pairs are provided for amplifying the cDNA-transcript of the nucleic acid molecule according to the first aspect. Said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

In a ninth aspect, the present invention provides a (synthetic) polynucleotide for detecting the nucleic acid molecule according to the first aspect.

Preferably,

(i) the (synthetic) polynucleotide is reverse complementary to the nucleic acid molecule according to the first aspect,

(ii) the (synthetic) polynucleotide is a fragment of the polynucleotide according to (i), preferably the (synthetic) polynucleotide is a fragment which is between 1 and 12, more preferably between 1 and 8, and most preferably between 1 and 5 or 1 and 3, i.e. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12, nucleotides shorter than the polynucleotide according to (i), or

(iii) the (synthetic) polynucleotide has at least 90%, preferably at least 95% or 99%, i.e. at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99%, sequence identity to the polynucleotide sequence of the polynucleotide according to (i) or polynucleotide fragment according to (ii).

It is particularly preferred that the polynucleotide as defined in (ii) has at least 90%, preferably at least 95% or 99%, i.e. 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99%, sequence identity over a continuous stretch of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or more nucleotides, preferably over the whole length, to the polynucleotide according to (i).

In addition, the polynucleotide as defined in (ii) (i.e. polynucleotide variant) or (iii) (i.e. polynucleotide fragment variant) is only regarded as a polynucleotide as defined in (ii) (i.e. polynucleotide variant) or (iii) (i.e. polynucleotide fragment variant) within the context of the present invention, if it is still capable of binding to, hybridizing with, or detecting the respective target nucleic acid molecule, i.e. the target nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation under stringent hybridization conditions. The skilled person can readily assess whether a polynucleotide as defined in (ii) (i.e. polynucleotide variant) or (iii) (i.e. polynucleotide fragment variant) is still capable of binding to, hybridizing with, recognizing or detecting the respective target nucleic acid molecule, i.e. the target nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. Suitable assays to determine whether hybridization under stringent conditions still occurs are well known in the art. However, as an example, a suitable assay to determine whether hybridization still occurs comprises the steps of: (a) incubating the polynucleotide as defined in (ii) or (iii) attached onto a biochip with the respective target nucleic acid molecule, i.e. the target nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, (b) washing the biochip to remove unspecific bindings, (c) subjecting the biochip to a detection system, and (d) analyzing whether the polynucleotide can still hybridize with the respective target nucleic acid molecule. As a positive control, the respective non-mutated polynucleotide as defined in (i) may be used.
Preferably, stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C; or 6x SSPE, 10% formamide, 0.01% Tween 20, 0.1x TE buffer, 0.5 mg/ml BSA, 0.1 mg/ml herring sperm DNA, incubating at 42°C with wash in 0.5x SSPE and 6x SSPE at 45°C.

The polynucleotide may be a RNA- or DNA-probe capable of binding to the nucleic acid molecule according to the first aspect. Preferably, the polynucleotide is a DNA-probe capable of binding to the nucleic acid molecule according to the first aspect. Said nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. More preferably, the polynucleotide comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3). In particular, the polynucleotide is a DNA-probe comprising or consisting of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3).

Which polynucleotide is capable of binding which nucleic acid molecule can easily be taken from Figures 2 and 3. For example, the polynucleotide (DNA-probe) with SEQ ID NO: 761 (AgProbe-1521-5p) is for detecting the novel miRNA candidate (nucleic acid molecule) with SEQ ID NO: 1 (novel-miR-1521-5p).

In a tenth aspect, the present invention relates to a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, a primer according to the sixth aspect, a cDNA-transcript according to the seventh aspect, a set of primer pairs according to the eighth aspect, and/or a polynucleotide according to the ninth aspect for use in diagnosing and/or prognosing a disease or condition.

Alternatively, the present invention relates to the (in vitro) use of a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, a primer according to the sixth aspect, a cDNA-transcript according to the seventh aspect, a set of primer pairs according to the eighth aspect, and/or a polynucleotide according to the ninth aspect for diagnosing and/or prognosing a disease or condition.

The disease or condition is preferably a lung, kidney, brain, testis, and/or heart related disease or condition, or a disease or condition associated with an impaired immune system, cancer or the cardiovascular system.

In an eleventh aspect, the present invention relates to a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, and/or a polynucleotide according to the ninth aspect for use as a medicament.

Alternatively, the present invention relates to the (in vitro) use of a nucleic acid molecule according to the first aspect, a nucleic acid molecule according to the second aspect, and/or a polynucleotide according to the ninth aspect for therapeutic intervention (therapy) or treatment, e.g. of a disease or condition.

The medicament may be used to treat a disease or condition. In particular, the medicament may be used to treat a lung, kidney, brain, testis, and/or heart related disease or condition, or a disease or condition associated with an impaired immune system, with cancer or the cardiovascular system.

The treatment of a disease or condition may include, but is not limited to, surgery, chemotherapy, radiotherapy, administration of a drug, exercise training, mental training, and/or physical rehabilitation.

In a twelfth aspect, the present invention relates to a method for diagnosing and/or prognosing a disease or condition in a patient comprising the steps of:

(i) determining the level of at least one nucleic acid molecule in a biological sample isolated from the patient, and

(ii) comparing the level to a reference level, wherein the comparison of said level to said reference level allows for the diagnosis and/or prognosis of the disease or condition,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95%, e.g. at least 95, 96, 97, 98, or 99%, sequence identity thereto. The nucleic acid molecule is a miRNA molecule.

The disease or condition is preferably a lung, kidney, brain, testis, and/or heart related disease or condition, or a disease or condition associated with an impaired immune system, cancer or the cardiovascular system.

According to the present invention, the level of the at least one nucleic acid molecule is determined in a biological sample, preferably in a blood sample, more preferably in a blood cell sample derived from a whole blood sample of a patient, preferably a human patient. If the biological sample is a tissue sample, it is usually obtained by biopsy. The whole blood sample is usually collected from the patient by conventional blood draw techniques. Blood collection tubes suitable for collection of whole blood include EDTA- (e.g. K2-EDTA Monovette tube), Na-citrate-, ACD-, Heparin-, PAXgene Blood RNA-, Tempus Blood RNA-tubes. The collected whole blood sample, which may intermediately be stored before use, is processed to result in a blood cell sample of whole blood. This is achieved by separation of the blood cell fraction (the cellular fraction of whole blood) from the serum/plasma fraction (the extra-cellular fraction of whole blood). It is preferred that the blood cell sample derived from the whole blood sample comprises red blood cells, white blood cells or platelets, it is more preferred that the blood cell sample derived from the whole blood sample comprises red blood cells, white blood cells and platelets, and most preferably the blood cell sample derived from the whole blood sample consists of (a mixture of) red blood cells, white blood cells and platelets.

As mentioned above, the nucleic acid molecule is a miRNA molecule. Preferably, the total RNA, including the miRNA fraction, or the miRNA-fraction is isolated from said blood cells present within said blood cell samples. Kits for isolation of total RNA including the miRNA fraction or kits for isolation of the miRNA-fraction are well known to those skilled in the art, e.g. miRNeasy-kit (Qiagen, Hilden, Germany), Paris-kit (Life Technologies, Weiterstadt, Germany). The miRNA level is then determined from the isolated RNA.

The level of the at least one nucleic acid molecule may be determined by any convenient means for this purpose. A variety of techniques are well known to those skilled in the art, as defined above, e.g. nucleic acid hybridisation, nucleic acid amplification, sequencing, mass spectroscopy, and/or flow cytometry based techniques or combinations thereof.

Subsequent to the determination of the level of the at least one nucleic acid molecule as defined above in step (i) of the method for diagnosing and/or prognosing a disease or condition, said method further comprises the step (ii) of comparing said level to a reference level, wherein the comparison of said level to said reference level allows for the diagnosis and/or prognosis of a disease or condition.

The reference level may be the level of a (control) subject known to be healthy, i.e. not suffering from the disease or condition to be tested. The reference level may also be the level of a (control) subject known to be diseased, i.e. suffering from the disease or condition to be tested. It is also possible that more than one reference level is actually used for comparison purposes. For example, a comparison may be made on the basis of a reference level of a (control) subject known to be healthy, a (control) subject known to suffer from a disease or condition, and/or a (control) subject known to suffer from another disease or condition. In this way, also a differential diagnosis may be made. The level may be below, above or comparable with the reference level. Such a (de)regulation may be indicative for the presence of the disease or condition.

In addition, the reference level may be the level of essentially the same, preferably the same, nucleic acid molecule (i.e. nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760) as in step (i), preferably in a biological sample such as a blood sample originated from the same source as the biological sample such as a blood sample from the patient (e.g. human or animal) to be tested, but obtained from subjects (e.g. human or animal) known to not suffer from a disease or condition, and/or from subjects (e.g. human or animal) known to suffer from a disease or condition. It is understood that the reference level of the nucleic acid molecule (i.e. nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760) is not necessarily obtained from a single (control) subject known to be affected by a disease or condition or known to be not affected by a disease or condition (e.g. healthy subject), but may be an average reference level of a plurality of (control) subjects known to be affected by a disease or condition, or known to be not affected by a disease or condition, e.g. at least 2 to 200 subjects, more preferably at least 10 to 150 subjects, and most preferably at least 20 to 100 subjects. The level and the reference level may be obtained from a subject/patient of the same species (e.g. human or animal), or may be obtained from a subject/patient of a different species (e.g. human or animal). Preferably, said expression profiles are obtained from the same species (e.g. human or animal), of the same gender (e.g. female or male) and/or of a similar age/phase of life (e.g. infant, young child, juvenile, adult) as the subject (e.g. human or animal) to be tested or diagnosed.

The comparison of the level of the nucleic acid molecule (i.e. nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760) of the patient to be diagnosed (e.g. human or animal) to the (average) reference level may then allow for diagnosing and/or prognosing of the disease or condition.
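The comparison against an average reference level can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions and not the method as claimed: the function names, the fold-change cut-off of 1.5 and the numeric levels are hypothetical and do not come from the description.

```python
from statistics import mean


def average_reference_level(control_levels):
    """Average level of one nucleic acid molecule over several control subjects."""
    if len(control_levels) < 2:
        raise ValueError("an average reference level needs at least 2 control subjects")
    return mean(control_levels)


def compare_to_reference(level, reference_level, fold_threshold=1.5):
    """Classify a patient level as 'above', 'below' or 'comparable'.

    fold_threshold is a hypothetical cut-off chosen for illustration only.
    """
    if reference_level <= 0 or level < 0:
        raise ValueError("levels must be non-negative and the reference positive")
    ratio = level / reference_level
    if ratio >= fold_threshold:
        return "above"
    if ratio <= 1 / fold_threshold:
        return "below"
    return "comparable"


# Hypothetical levels in arbitrary units, for illustration only.
healthy_controls = [100.0, 120.0, 110.0, 90.0]
reference = average_reference_level(healthy_controls)
print(reference, compare_to_reference(210.0, reference))
```

Using several reference levels (e.g. one from healthy controls and one from diseased controls) would amount to calling `compare_to_reference` once per reference level.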

In an alternative embodiment of the method of the present invention, the reference is an algorithm or mathematical function.

It is preferred that the biological sample is a blood sample. The blood sample is preferably a whole blood sample, more preferably a blood cell fraction isolated from a whole blood sample, most preferably a blood cell fraction isolated from a whole blood sample comprising red blood cells, platelets and leukocytes or it is a blood cell fraction isolated from a whole blood sample consisting of a mixture of red blood cells, platelets and leukocytes.

Preferably, the polynucleotide according to the ninth aspect is used in the method according to the twelfth aspect in order to determine the level of the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

The polynucleotide may be an RNA- or DNA-probe capable of binding to the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. Preferably, the polynucleotide is a DNA-probe capable of binding to the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

More preferably, the polynucleotide comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3). In particular, the polynucleotide is a DNA-probe comprising or consisting of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3).

Which polynucleotide is capable of binding which nucleic acid molecule can easily be taken from Figures 2 and 3.

In a thirteenth aspect, the present invention relates to means for determining the level of at least one nucleic acid molecule comprising:

(a) at least one polynucleotide according to the ninth aspect,

(b) at least one primer according to the sixth aspect, and/or

(c) at least one set of primer pairs according to the eighth aspect,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95%, e.g. at least 95, 96, 97, 98, or 99%, sequence identity thereto. The nucleic acid molecule is a miRNA molecule.
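The "at least 95% sequence identity" criterion can be illustrated with a simplified sketch. It assumes two sequences of equal length compared position by position without gaps, whereas sequence identity is normally determined from an alignment; the function names and the 22-nucleotide sequences are made up for illustration and are not taken from the sequence listing.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped percent identity of two equal-length sequences (simplifying assumption)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("this sketch only handles non-empty sequences of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a, seq_b, threshold=95.0):
    """True if the two sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 22-nt sequences, for illustration only.
reference = "ACGUACGUACGUACGUACGUAC"
one_mismatch = "ACGUACGUACGUACGUACGUAG"   # 21 of 22 positions identical
two_mismatches = "UCGUACGUACGUACGUACGUAG"  # 20 of 22 positions identical

print(meets_identity_threshold(reference, one_mismatch))
print(meets_identity_threshold(reference, two_mismatches))
```

For a sequence of 22 nucleotides, a single mismatch still satisfies the 95% threshold (21/22 is about 95.5%), whereas two mismatches do not (20/22 is about 90.9%).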

The at least one polynucleotide is preferably comprised on, more preferably attached to or linked with, a matrix. The matrix is preferably selected from the group consisting of a microarray and a set of beads.

The polynucleotide may be an RNA- or DNA-probe capable of binding to the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. Preferably, the polynucleotide is a DNA-probe capable of binding to the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

More preferably, the polynucleotide comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3). In particular, the polynucleotide is a DNA-probe comprising or consisting of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3).

Which polynucleotide is capable of binding which nucleic acid molecule can easily be taken from Figures 2 and 3.

In a fourteenth aspect, the present invention relates to means for determining the level of at least one nucleic acid molecule comprising:

a microarray, an RT-PCR system, a PCR system, a flow cytometer, a Luminex system and/or a next generation sequencing system,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95%, e.g. at least 95, 96, 97, 98, or 99%, sequence identity thereto. The nucleic acid molecule is a miRNA molecule.

The microarray may comprise the polynucleotide according to the ninth aspect.

The polynucleotide may be an RNA- or DNA-probe capable of binding to the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760. Preferably, the polynucleotide is a DNA-probe capable of binding to the nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

More preferably, the polynucleotide comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3). In particular, the polynucleotide is a DNA-probe comprising or consisting of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 (see Figure 2) and SEQ ID NO: 1521 to SEQ ID NO: 2280 (see Figure 3).

Which polynucleotide is capable of binding which nucleic acid molecule can easily be taken from Figures 2 and 3.

In a fifteenth aspect, the present invention relates to a kit for diagnosing and/or prognosing a disease or condition comprising:

(a) means for determining the level of at least one nucleic acid molecule according to the thirteenth or fourteenth aspect, and

(b) at least one reference level,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95%, e.g. at least 95, 96, 97, 98, or 99%, sequence identity thereto. The nucleic acid molecule is a miRNA molecule.

Herein, the level in (a) and the reference level in (b) are determined from at least one isolated nucleic acid molecule according to the first aspect in the same type of biological sample, preferably from age- and sex-matched patients/subjects.

The level of the at least one nucleic acid molecule is determined in a biological sample obtained/isolated from a patient. If the level of the at least one nucleic acid molecule is determined in a blood sample, the kit may optionally comprise a tube for blood sample storage. The tube for blood sample storage may be any tube which allows storage of a blood sample for a sufficient time by preventing blood degradation. In particular, the tube for blood sample storage may be a tube which protects the RNA-fraction against degradation. Such a tube may be a PAXgene blood RNA tube. The PAXgene blood RNA tube is a special blood collection tube that has been developed to analyze RNA expression in blood cells. It contains a proprietary reagent that stabilizes intracellular RNA (RNA present in blood cells, including miRNA). After blood collection, the blood is incubated in order to allow the proprietary reagent to stabilize the intracellular RNA. Afterwards, the blood collection tube is centrifuged, which results in the separation of the solid (cellular, blood cells) component of blood and the liquid (extra-cellular, plasma/serum) component of blood. While the liquid component is discarded, the pelleted solid component is isolated and used for downstream (miRNA-) analyses.

Alternatively, the tube for blood sample storage may be an EDTA tube, a heparin tube, a citrate tube, or a Tempus tube.

It is also preferred that the blood sample is selected from the group consisting of whole blood and a blood cellular fraction. It is more preferred that the blood cellular fraction comprises erythrocytes, leukocytes, and thrombocytes. As mentioned above, the blood cellular fraction is produced from whole blood by removing the extracellular fraction (serum and/or plasma). In other words, the blood cellular fraction is depleted of the extracellular blood components, such as serum and/or plasma.

The kit may further comprise a data carrier. Said data carrier may be a non-electronic data carrier, e.g. a graphical data carrier such as an information leaflet, an information sheet, a bar code or an access code, or an electronic data carrier such as a floppy disk, a compact disk (CD), a digital versatile disk (DVD), a microchip or another semiconductor-based electronic data carrier. The access code may allow access to a database, e.g. an internet database, a centralized, or a decentralized database. The access code may also allow access to application software that causes a computer to perform tasks for computer users, or to a mobile app, which is software designed to run on smartphones and other mobile devices.

Said data carrier may further comprise a reference level of the level of the at least one nucleic acid molecule determined herein. In case that the data carrier comprises an access code which allows the access to a database, said reference level may be deposited in this database.

The kit may be used for conducting the method according to the twelfth aspect. The data carrier may comprise information or instructions on how to carry out the method according to the twelfth aspect.

Said kit may also comprise materials desirable from a commercial and user standpoint including a buffer(s), a reagent(s) and/or a diluent(s) for determining the level mentioned above.

The disease or condition is preferably a lung, kidney, brain, testis, and/or heart related disease or condition, or a disease or condition associated with an impaired immune system.

In a sixteenth aspect, the present invention relates to a matrix comprising at least one polynucleotide which comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 and SEQ ID NO: 1521 to SEQ ID NO: 2280. The polynucleotides are suitable for detecting a nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760.

Which polynucleotide is capable of binding which nucleic acid molecule can easily be taken from Figures 2 and 3.

Preferably, the matrix is selected from the group consisting of a microarray and a set of beads. In a preferred embodiment, the present invention relates to a microarray comprising at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 100, 150, 200, 250, 300, 350, 400, 500, 550, 600, 650, 700, 750, or 760 polynucleotides which comprise a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 and SEQ ID NO: 1521 to SEQ ID NO: 2280.

The invention is further summarized as follows:

1. A nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

2. A nucleic acid molecule that is a complement to the nucleic acid molecule of item 1.

3. A vector comprising the nucleic acid molecule of items 1 or 2.

4. A host cell comprising the nucleic acid molecule of items 1 or 2.

5. A host cell comprising the vector of item 3.

6. A primer for reverse transcribing the nucleic acid molecule of item 1.

7. A cDNA-transcript of the nucleic acid molecule of item 1.

8. A set of primer pairs for amplifying the cDNA-transcript of item 7.

9. A polynucleotide for detecting the nucleic acid molecule of item 1.

10. The polynucleotide of item 9, wherein the polynucleotide comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 and SEQ ID NO: 1521 to SEQ ID NO: 2280.

11. A nucleic acid molecule of item 1, a nucleic acid molecule of item 2, a primer of item 6, a cDNA-transcript of item 7, a set of primer pairs of item 8, and/or a polynucleotide of items 9 or 10 for use in diagnosing and/or prognosing a disease.

12. A nucleic acid molecule of item 1, a nucleic acid molecule of item 2, and/or a polynucleotide of items 9 or 10 for use as a medicament.

13. A method for diagnosing and/or prognosing a disease in a patient comprising the steps of:

(i) determining the level of at least one nucleic acid molecule in a biological sample isolated from the patient, and

(ii) comparing the level to a reference level, wherein the comparison of said level to said reference level allows for the diagnosis and/or prognosis of the disease, wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

14. The method of item 13, wherein the level of at least one nucleic acid molecule is determined using at least one polynucleotide of items 9 or 10.

15. Means for determining the level of at least one nucleic acid molecule comprising:

(a) at least one polynucleotide of items 9 or 10,

(b) at least one primer of item 6, and/or

(c) at least one set of primer pairs of item 8,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

16. The means of item 15, wherein the at least one polynucleotide of items 9 or 10 is comprised on, preferably attached to or linked with, a matrix.

17. The means of item 16, wherein the matrix is selected from the group consisting of a microarray and a set of beads.

18. Means for determining the level of at least one nucleic acid molecule comprising:

a microarray, an RT-PCR system, a PCR system, a flow cytometer, a Luminex system and/or a next generation sequencing system,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

19. A kit for diagnosing and/or prognosing a disease comprising:

(a) means for determining the level of at least one nucleic acid molecule of any one of items 15 to 18, and

(b) at least one reference level,

wherein the at least one nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 760, a fragment thereof, and a nucleotide sequence having at least 95% sequence identity thereto.

20. A matrix comprising at least one polynucleotide which comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 761 to SEQ ID NO: 1520 and SEQ ID NO: 1521 to SEQ ID NO: 2280.

21. The matrix of item 20, wherein the matrix is selected from the group consisting of a microarray and a set of beads.

Various modifications and variations of the invention will be apparent to those skilled in the art without departing from the scope of invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in the art in the relevant fields are intended to be covered by the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1: Listing of the novel miRNAs having the nucleotide sequence of SEQ ID NO: 1 to SEQ ID NO: 760 (sequence identification number; see sequence listing for nucleotide sequence details), including expression levels obtained from hybridization with total RNA samples derived from different human cells and tissues, including blood cells (including erythrocytes, leucocytes and thrombocytes derived from whole blood collected in PAXgene RNA tubes), blood plasma, lung tissue, kidney tissue, brain tissue, testis tissue, heart tissue and the Agilent miRNA reference sample. Herein, the blood cell and the plasma samples were employed as sample pools (see EXAMPLE 3 for details).

Figure 2: Listing of polynucleotides having nucleotide sequences of SEQ ID NO: 761 to SEQ ID NO: 1520 for detecting the novel miRNA candidates (nucleic acid molecules) with SEQ ID NO: 1 to SEQ ID NO: 760. Herein, the sequences with SEQ ID NO: 761 to SEQ ID NO: 1520 represent hairpin-like DNA-probes that were comprised on the Agilent custom microarray that was utilized for validation of the 4649 predicted novel miRNA candidates. For example, the polynucleotide (DNA-probe) with SEQ ID NO: 761 (AgProbe-1521-5p) is for detecting the novel miRNA candidate (nucleic acid molecule) with SEQ ID NO: 1 (novel-miR-1521-5p).

Figure 3: Listing of further polynucleotides having the nucleotide sequence of SEQ ID NO: 1521 to SEQ ID NO: 2280 for detecting the novel miRNA candidates (nucleic acid molecules) with SEQ ID NO: 1 to SEQ ID NO: 760. For example, the polynucleotide (DNA-probe) with SEQ ID NO: 1521 (DNAProbe-1521-5p) is for detecting the novel miRNA candidate (nucleic acid molecule) with SEQ ID NO: 1 (novel-miR-1521-5p).
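The relation between the probes and their targets can be sketched as a simple lookup. The only pairings stated in the text are SEQ ID NO: 761 and SEQ ID NO: 1521, both of which detect SEQ ID NO: 1; the sketch below assumes that the remaining probes follow the same order, i.e. constant offsets of 760 and 1520. This ordering is an assumption, the function names are hypothetical, and Figures 2 and 3 remain the authoritative source for the pairings.

```python
N_TARGETS = 760      # novel miRNAs, SEQ ID NO: 1 to 760
FIG2_OFFSET = 760    # assumed offset for probes SEQ ID NO: 761 to 1520
FIG3_OFFSET = 1520   # assumed offset for probes SEQ ID NO: 1521 to 2280


def probes_for_target(target_seq_id):
    """Return the assumed probe SEQ ID NOs (Figure 2 and Figure 3) for a miRNA target."""
    if not 1 <= target_seq_id <= N_TARGETS:
        raise ValueError("target SEQ ID NO must be between 1 and 760")
    return {
        "figure_2": target_seq_id + FIG2_OFFSET,
        "figure_3": target_seq_id + FIG3_OFFSET,
    }


def target_for_probe(probe_seq_id):
    """Return the assumed miRNA target SEQ ID NO for a probe SEQ ID NO."""
    if 761 <= probe_seq_id <= 1520:
        return probe_seq_id - FIG2_OFFSET
    if 1521 <= probe_seq_id <= 2280:
        return probe_seq_id - FIG3_OFFSET
    raise ValueError("probe SEQ ID NO must be between 761 and 2280")


print(probes_for_target(1))
```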

EXAMPLES

The Examples are designed in order to further illustrate the present invention and serve a better understanding. They are not to be construed as limiting the scope of the invention in any way.

EXAMPLE 1: Next generation sequencing data sets

We developed miRMaster as a web-based application to facilitate high-throughput sequencing data analysis of human samples from raw sequencing files provided in the FASTQ format. Building on the basic principle of miRDeep2, the most frequently used prediction tool for miRNAs, we implemented our own predictor with an extended feature set. miRMaster allows searching for novel miRNA candidates, quantifying miRNA expression, and identifying isoforms and variants of miRNAs. To detect as many novel miRNA candidates as possible, we performed a comprehensive analysis of 1,836 high-throughput sequencing data sets containing 20 billion reads. Herein, we analyzed an in-house NGS miRNA sample collection of 1,097 data sets derived from blood and blood cell components. Further, we downloaded 739 samples from 4 series of the GEO database (GSE64142, GSE53080, GSE49279 and GSE45159). All samples have been sequenced using Illumina next-generation sequencing.

EXAMPLE 2: Novel miRNA candidate identification

After having evaluated the performance of our classifier on a positive and negative training set, we applied the model to 1,097 data sets (see EXAMPLE 1). These contain 15 billion reads in a total file size of 486 GB. Employing the newly developed miRMaster software to identify novel miRNAs, in a first step 10,651 miRNA precursors were predicted. Afterwards, the predicted precursors were merged and all sequencing reads were aligned to the potential new precursors. From these mappings the novel miRNA candidates were derived. In addition, we mapped the predicted miRNAs to the human non-coding RNAs of Ensembl (release 85) and to NONCODE 2016 [27] using BLAST+ [28], while allowing an overlap of 90% of the aligned sequences and at most one mismatch. All precursors containing mapping miRNAs were then discarded. Finally, our approach resulted in the identification of 4649 novel miRNA candidates.
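The filtering step against known non-coding RNAs can be sketched as follows. This is a minimal illustration and not the miRMaster implementation: the `Hit` record, the function names and the example identifiers are hypothetical, and the "overlap of 90%" is interpreted here, as an assumption, as the aligned length divided by the miRNA length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a predicted miRNA against a known ncRNA (hypothetical record)."""
    mirna_id: str
    precursor_id: str
    mirna_length: int
    aligned_length: int
    mismatches: int


def maps_to_ncrna(hit, min_overlap=0.90, max_mismatches=1):
    """True if the hit covers at least 90% of the miRNA with at most one mismatch."""
    overlap = hit.aligned_length / hit.mirna_length
    return overlap >= min_overlap and hit.mismatches <= max_mismatches


def discard_mapping_precursors(precursor_ids, hits):
    """Drop every precursor that contains at least one miRNA mapping to a known ncRNA."""
    flagged = {hit.precursor_id for hit in hits if maps_to_ncrna(hit)}
    return [pid for pid in precursor_ids if pid not in flagged]


# Hypothetical precursors and hits, for illustration only.
precursors = ["pre-A", "pre-B", "pre-C"]
hits = [
    Hit("mir-A", "pre-A", mirna_length=22, aligned_length=22, mismatches=0),  # maps
    Hit("mir-B", "pre-B", mirna_length=22, aligned_length=15, mismatches=0),  # overlap too small
    Hit("mir-C", "pre-C", mirna_length=22, aligned_length=21, mismatches=2),  # too many mismatches
]
print(discard_mapping_precursors(precursors, hits))
```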

EXAMPLE 3: Microarray Validation

Validation using custom microarray: To perform a first-pass iteration and to minimize the risk of false positives due to either Next Generation Sequencing (NGS) artifacts or low sample quality containing many degraded RNAs, we designed a custom microarray containing all human miRNAs from miRBase, the miRNAs from the study by Londin and co-workers (Londin, E. et al., Proc. Natl. Acad. Sci. U. S. A., 112, E1106-1115) as well as the 4649 miRNA candidates identified as outlined in Example 2. Among our predicted miRNAs we selected only those expressed in at least 50 samples which were not flagged as similar to other ncRNAs. The final microarray contained 11,866 miRNA candidates that have each been measured in 20 replicates (237,320 features per sample), including the 760 novel miRNAs with SEQ ID NO: 1 to SEQ ID NO: 760 (see Figure 1). In order to measure the expression of the novel miRNAs in different human cells and tissues, we compiled a set of eight different human RNA samples: we purchased human total RNA samples from lung, brain, kidney, testis and heart tissues from Life Technologies (Cat. No. AM7968, AM7962, AM7976, AM7972 and AM7966, respectively) and the human miRNA reference kit from Agilent Technologies (Cat. No. 750700), which represents a pool of several human tissues and cell lines. Furthermore, we used a PAX blood RNA pool and a plasma RNA pool. The PAX blood RNA pool comprised 11 blood samples collected in PAXgene tubes and purified with the PAXgene Blood miRNA Kit from Qiagen according to the manufacturer's instructions. The blood samples were derived from four lung cancer patients, two Alzheimer's Disease patients, two patients with Wilms Tumor, and three healthy donors. The plasma RNA pool comprised 10 plasma samples from healthy donors and was isolated using the miRNeasy Serum/Plasma Kit according to the manufacturer's recommendations with minor adaptations. To ensure sufficient RNA precipitation, we added 1 μl of 20 mg/ml glycogen (Invitrogen) in the precipitation step. RNA concentration was measured using Nanodrop (ThermoFisher). RNA quality was assessed using the Agilent Bioanalyzer Nano kit (for all tissue-derived RNAs) or the Small RNA kit (for the plasma sample).
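The array layout figures can be checked with a few lines of arithmetic. The sketch below also shows one possible way of collapsing the 20 replicate features of a candidate into a single value; the use of the median is an assumption made for illustration, as the text does not state how replicates were summarized, and the function names and intensities are hypothetical.

```python
from statistics import median


def features_per_sample(n_candidates, n_replicates):
    """Number of array features when every candidate is spotted n_replicates times."""
    return n_candidates * n_replicates


def summarize_replicates(intensities):
    """Collapse replicate intensities into one value (median is an assumed choice)."""
    return median(intensities)


# 11,866 candidates measured in 20 replicates each.
print(features_per_sample(11_866, 20))

# Hypothetical replicate intensities for one candidate.
print(summarize_replicates([10.2, 9.8, 10.5, 10.1]))
```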

The expression of the 11,866 miRNA candidates, including the 760 novel miRNAs with SEQ ID NO: 1 to SEQ ID NO: 760, was determined using a customized Agilent human miRNA microarray comprising the polynucleotides with SEQ ID NO: 761 to SEQ ID NO: 1520 as DNA-probes for binding the corresponding miRNA targets with SEQ ID NO: 1 to SEQ ID NO: 760. Herein, the expression of the 760 novel miRNAs was determined using the corresponding hairpin-like DNA-probes with SEQ ID NO: 761 to SEQ ID NO: 1520. As input we used 100 ng total RNA as measured by Nanodrop for all tissue-derived RNAs, and 1 ng miRNA as measured using the Bioanalyzer Small RNA chip for the plasma sample. Using the Agilent miRNA Complete Labeling and Hyb Kit according to the manufacturer's instructions, RNAs were dephosphorylated and labeled with Cy3-pCp. Labeled RNAs were hybridized to the custom microarrays for exactly 20 hours at 55°C. After hybridization, arrays were washed for 5 min each in Gene Expression Wash Buffer 1 (room temperature) and Gene Expression Wash Buffer 2 (37°C). Subsequently, arrays were dried and scanned in an Agilent microarray scanner (G2505C). Expression data was extracted using Agilent feature extraction software. The expression levels of the 760 novel miRNAs, which were determined utilizing the DNA-probes (polynucleotides) with SEQ ID NO: 761 to SEQ ID NO: 1520, are shown in Figure 1, including the different expression levels found in different human cells and tissues as outlined above. Downstream processing of signals has been carried out with R (version 3.2.4). Specifically, for clustering the expression intensities, hierarchical clustering using the Euclidean distance has been performed as implemented in the Heatplus package.
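Hierarchical clustering with the Euclidean distance can be illustrated with a small, dependency-free sketch. It is written in Python rather than R and does not reproduce the Heatplus package; complete linkage is assumed here because the text does not name the linkage method, and the sample names and intensities are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def cluster_distance(members_a, members_b, profiles):
    """Complete linkage: largest Euclidean distance between members of two clusters."""
    return max(dist(profiles[a], profiles[b]) for a in members_a for b in members_b)


def hierarchical_clustering(profiles):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters that are closest under complete linkage.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], profiles),
        )
        merges.append((a, b, cluster_distance(a, b, profiles)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical expression intensities of three miRNAs in four samples.
profiles = {
    "sample_A": [1.0, 2.0, 3.0],
    "sample_B": [1.0, 2.0, 4.0],
    "sample_C": [8.0, 9.0, 10.0],
    "sample_D": [8.0, 9.0, 12.0],
}
for cluster_a, cluster_b, distance in hierarchical_clustering(profiles):
    print(cluster_a, cluster_b, round(distance, 2))
```

The sequence of merges corresponds to the dendrogram that a heatmap package would draw alongside the expression matrix.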