Complete nucleotide sequence and gene organization of the broad-host-range plasmid RSF1010.
Gene, 75 (1989) 271
Elsevier
GEN 02884

(Recombinant DNA; incompatibility group Q; a DNA bacteriophages λ, T7, M13)

Peter Scholz *, Volker Haring, Brigitte Wittmann-Liebold, Keith Ashman *, Michael Bagdasarian * and Eberhard Scherzinger

Max-Planck-Institute for Molecular Genetics, D-1000 Berlin-33 (F.R.G.)

Received by A.M. Chakrabarty: 2 August 1988
Accepted: 6 November 1988

SUMMARY

We present the complete nucleotide sequence of RSF1010, a naturally occurring broad-host-range plasmid belonging to the Escherichia coli incompatibility group Q and encoding resistance to streptomycin and sulfonamides. A molecule of RSF1010 DNA consists of 8684 bp and has a G + C content of 61%. Analysis of the distribution of translation start and stop codons in the sequence has revealed the existence of more than 40 open reading frames potentially capable of encoding polypeptides of 60 or more amino acids. To date, products of eleven such potential RSF1010 genes have been identified through the application of controlled expression vector systems, and for eight of them, the reading frame has been confirmed by N- and/or C-terminal amino acid sequence determinations on the purified proteins. The sequencing results are discussed in relation to the systems of replication, host range, conjugal mobilization and antibiotic resistance determinants associated with the RSF1010 plasmid.

INTRODUCTION

The small, nonconjugative, IncQ group plasmid, RSF1010, which specifies resistance to Sm and Su, is of special interest because of its broad host range replication properties among Gram-negative bacteria (for a recent review, see Frey and Bagdasarian, 1989).

Previously published work has identified and mapped oriV, the unique origin of vegetative DNA replication (De Graaff et al., 1978; Scherzinger et al., 1984; Haring and Scherzinger, 1989), repA, repB, and repC, the genes for essential, positive-acting replication proteins (Scherzinger et al., 1984; Scholz et al., 1984; Haring et al., 1985), oriT, the site of the relaxation complex and origin of conjugative DNA transfer (Nordheim et al., 1980; Derbyshire and Willetts, 1987), mobA, mobB, and mobC, genes encoding trans-active proteins involved in plasmid mobilization (Bagdasarian et al., 1982; Derbyshire et al., 1987), and sul and str, the loci conferring resistance to Su and Sm, respectively (Heffron et al., 1975; Bagdasarian et al., 1982).

To fully understand the functional organization of the plasmid and to facilitate its use as a cloning vector, we have determined, and report here, its entire nucleotide sequence of 8684 bp. All plasmid genes listed above have been identified in the sequence, and three hitherto unrecognized protein-coding genes have been revealed. The major transcription signals for E. coli RNA polymerase have also been identified. Where possible, the accuracy of the sequence was checked by restriction analysis or, more rigorously, by terminal amino acid sequence determination carried out on purified RSF1010 proteins.

Correspondence to: Dr. E. Scherzinger, Max-Planck-Institute for Molecular Genetics, D-1000 Berlin-33 (F.R.G.) Tel. (30)8307-…; Fax (30)….
* Present addresses: (M.B.) Michigan Biotechnology Institute, Lansing, MI 48909 (U.S.A.) Tel. (517)337-3181; Fax (517)337-…; (K.A.) European Molecular Biology Laboratory, 6900 Heidelberg (F.R.G.) Tel. (62)…; (P.S.) Schering AG, Biochemical Institute, D-1000 Berlin 65 (F.R.G.) Tel. (30)468-1236.

Abbreviations: aa, amino acid(s); bp, base pair(s); ds, double strand(ed); HPLC, high-performance liquid chromatography; IPTG, isopropyl-β-D-thiogalactopyranoside; kb, 1000 bp; Km, kanamycin; nt, nucleotide(s); ORF, open reading frame; ori, origin of DNA replication; PAGE, polyacrylamide gel electrophoresis; PolIk, Klenow (large) fragment of E. coli DNA polymerase I; RF, replicative form; SDS, sodium dodecyl sulfate; Sm, streptomycin; ss, single strand(ed); Su, sulfonamide; Tn, transposon; wt, wild type; [ ], designates plasmid-carrier state.

0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)

There is nothing currently known to us about the genetics of RSF1010 or the physical properties of its DNA that is in disagreement with the sequencing results. We believe, therefore, that the error frequency in the nucleotide sequence presented here is very low.

TABLE I
Plasmids used for overproduction and identification of RSF1010 proteins

Columns: Plasmid; RSF1010 DNA inserted downstream from tac or T7 φ10 (a); Target ORFs (b); Ref. (c)

Vector        Plasmid     RSF1010 DNA inserted
ptac12        pVH2        nt …
pKK223-3      pVH3        nt …
pT7-5         pOT740      nt …
pT7-6         pOT781      nt …
              pOT784      nt 7511-8555
              pOT762      nt …
              pOT762d1    nt …, Δ( … )
              pOT762d2    nt …, Δ( … )

Target ORFs (b) with Ref. (c), in table order: A (1); E,F,A,C (1); D,B' (1); B,B',D (2); E,F (2); H,I (2); H,I (2); H,I (2); H (2); (H) (2); G (2); (G) (2); K (2); K (2); (K) (2).

a All inserts are in one or between two unique restriction sites immediately downstream from the inducible promoter (tac in ptac12 and pKK223-3; T7 φ10 in pT7-5 and pT7-6), and all except those in pOT762, pOT762d1 and pOT762d2 are oriented such that transcription from the respective promoter would proceed from left to right on the conventional RSF1010 map (Fig. 2). Δ indicates deletion. Positions of deleted nt are indicated in brackets.
b The ORFs listed are those known to lie within the RSF1010 fragment carried by that recombinant plasmid (see Fig. 1). In pOT745 the reading frame for protein H is truncated at nt 748, which should yield a fusion protein that is 25 aa shorter than the full-length (267 aa) ORF H product. In pOT784 the reading frame for protein G is truncated at nt 8555, which should yield a fusion protein that is 10 aa shorter than the full-length (262 aa) ORF G product. Plasmid pOT762d2 carries the entire coding sequence for protein K except for an internal 4-bp deletion (nt …), which should result in a frame shift and premature termination of translation 10 bp after the deletion, thereby producing a protein of 54 instead of 94 aa. (G), (H) and (K) indicate truncated ORFs.
c References are: 1, Haring et al. (1985); Haring and Scherzinger (1989); 2, constructed in this laboratory and to be described in detail elsewhere.

MATERIALS AND METHODS

(a) Bacterial strains, plasmids and phage
The strains used were all E. coli K-12 derivatives. SK1592 (Kushner, 1978) transformed with RSF1010 (Guerry et al., 1974) or pKT228 (RSF1010::Tn3 Mob-; Scherzinger et al., 1984) was used as the source of plasmid DNA for nucleotide sequencing. JM101 served as the host strain to recover M13mp8 and M13mp9 phages (Messing, 1983) containing DNA subfragments to be sequenced. A list of recombinant plasmids that overproduce RSF1010 proteins and that have been constructed in the course of this study is given in Table I. The tac-promoter-containing pBR322 derivatives ptac12 (Amann et al., 1983) and pKK223-3 (Brosius and Holy, 1984), which served as cloning vectors in the construction of plasmids of the pVH series, have been described. These plasmids were maintained in strain HB101 (Boyer and Roulland-Dussoix, 1969) harboring pVH1 (Haring et al., 1985), a ColD-based Km-resistance plasmid carrying the lacIq repressor gene. Plasmids pT7-5 and pT7-6 are pBR322-based vectors containing the bacteriophage T7 φ10 promoter (S. Tabor and C.C. Richardson, unpublished). These expression vectors and the recombinant plasmids derived from them (pOT series, Table I) were maintained in HB101 harboring pGP1-2, a plasmid carrying the T7 RNA polymerase structural gene under the control of the heat-inducible expression system (Tabor and Richardson, 1985).

(d) DNA purification and recombinant DNA techniques
The following procedures were performed using standard techniques as described in Maniatis et al. (1982) and by the manufacturers of enzymes: preparation of supercoiled plasmid and M13 phage RFI DNA, restriction enzyme digestion, agarose gel electrophoresis, DNA ligation and bacterial transformation. DNA fragments were recovered from gel slices by electroelution into dialysis bags followed by adsorption to and elution from Elutip-d columns (Schleicher & Schuell). M13 viral DNA used for sequencing was extracted from purified phage as described by Messing (1983).

(e) Nucleotide sequence analysis
Five overlapping DNA fragments from the wt RSF1010, the 1.4-kb AccI-AvaI fragment (coordinates 3.6-5.0 kb), the 1.6-kb EcoRV-XhoII fragment (4.5-6.1 kb), the 2.3-kb AccI-PstI fragment (5.5-7.8 kb), the 1.2-kb BstEII-EcoRI fragment (7.5-8.7 kb) and the 2.1-kb PstI-XmnI fragment (8.6-2.0 kb), plus the 2.5-kb PvuII-PstI fragment from pKT228 (covering the RSF1010 region from coordinate 1.9 to 4.0), served as the primary source of DNA for sequencing (see Fig. 2).
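The fragment coordinates quoted above can be checked for complete coverage of the circular map with a few lines of code. The following is only an illustrative sketch, not part of the original work: it assumes the rounded kb coordinates given in the text, treats the plasmid as a circle of 8684 bp, and clamps the rounded end coordinate 8.7 kb to the plasmid length. The names `FRAGMENTS`, `fragment_positions` and `uncovered_positions` are hypothetical.

```python
GENOME_BP = 8684  # plasmid length reported in the text

# (name, start_kb, end_kb) as quoted in section (e). An end coordinate smaller
# than the start means the fragment runs through the origin of the circular map.
FRAGMENTS = [
    ("AccI-AvaI", 3.6, 5.0),
    ("EcoRV-XhoII", 4.5, 6.1),
    ("AccI-PstI", 5.5, 7.8),
    ("BstEII-EcoRI", 7.5, 8.7),
    ("PstI-XmnI", 8.6, 2.0),
    ("PvuII-PstI (pKT228)", 1.9, 4.0),
]


def fragment_positions(start_kb, end_kb, genome_bp=GENOME_BP):
    """Return the set of 0-based positions spanned by one fragment."""
    start = round(start_kb * 1000) % genome_bp
    # The published coordinates are rounded, so clamp to the plasmid length.
    end = min(round(end_kb * 1000), genome_bp)
    if start < end:
        return set(range(start, end))
    # Fragment wraps around the origin of the circular map.
    return set(range(start, genome_bp)) | set(range(end))


def uncovered_positions(fragments, genome_bp=GENOME_BP):
    """Return every position not covered by at least one fragment."""
    covered = set()
    for _name, start_kb, end_kb in fragments:
        covered |= fragment_positions(start_kb, end_kb, genome_bp)
    return set(range(genome_bp)) - covered


if __name__ == "__main__":
    gaps = uncovered_positions(FRAGMENTS)
    print("uncovered positions:", len(gaps))
```

With the six fragments as listed, no position is left uncovered under these assumptions; dropping any one fragment opens a gap, which is why the pKT228-derived fragment is needed to close the circle.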
These fragments, purified by 0.8% or 1.0% agarose gel electrophoresis and made blunt-ended by treatment with T4 DNA polymerase and dNTPs, were inserted into the HincII site of M13mp8 or M13mp9, and pairs of clones containing the fragment of interest in opposing orientations were selected. Sets of nonrandom, overlapping deletion fragments suitable for sequencing were then generated from these recombinant clones by limited BAL 31 digestion as described by Poncz et al. (1982). The resulting fragments, recloned into appropriately cut M13mp8 or M13mp9 vectors, were subjected to DNA sequencing by the dideoxy chain-termination method of Sanger et al. (1977) as modified by Biggin et al. (1983) and Garoff and Ansorge (1981). All sequences were confirmed from at least two overlapping clones, and the entire RSF1010 sequence was determined on both strands of the DNA. The sequence information was analyzed using the latest release (version 5) of the UWGCG software (Devereux et al., 1984).

(b) Enzymes and chemicals
Restriction endonucleases and other enzymes used in cloning were obtained from Boehringer-Mannheim, New England Biolabs or Bethesda Research Laboratories. PolIk, the M13 universal sequencing primer (17-mer), unlabeled deoxy- and dideoxyribonucleoside triphosphates, [α-35S]dATP (approx. 600 Ci/mmol), and L-[35S]methionine (approx. 800 Ci/mmol) were from the Radiochemical Centre, Amersham. Carboxypeptidase P was from Boehringer and protein Mr standards were from Pharmacia. All other chemicals were of the highest commercial purity available.

(c) RSF1010-encoded proteins
Proteins A, B, B', C, D, E, F and H of RSF1010 were isolated by conventional chromatography procedures from IPTG-induced cells of E. coli HB101[pVH1] carrying the following overproducing plasmids (Table I): pVH2 (A), pVH9 (B), pVH3 (B' and D), pVH4 (C), pVH25 (E and F), and pVH29 (H).
The purification and characterization of each protein will be reported elsewhere (Haring and Scherzinger, 1989; E.S., unpublished data). An SDS-PAGE analysis of our present preparations of the above eight plasmid-encoded proteins is shown in Fig. 3. Solutions of the purified proteins (2-5 mg/ml) in 20 mM Tris·HCl pH 7.6/0.1-0.5 M NaCl/1.0 mM dithiothreitol/10% ethylene glycol were kept frozen at -70°C.

(f) Amino acid composition and sequence determination
Proteins were dialyzed extensively against water adjusted to pH 4.3 with acetic acid. Salt-free protein samples were hydrolyzed by treatment with 5.7 N HCl, 0.02% mercaptoethanol in vacuo at 110°C for 24 h and analyzed with a Durrum model 500 amino acid analyzer. N-terminal amino acid sequence analysis of the salt-free protein preparations was carried out by an automatic Edman degradation procedure on a self-constructed sequencer equipped with an on-line HPLC phenylthiohydantoin amino acid identification system (Wittmann-Liebold and Ashman, 1985). Amino acids from the C-terminus of polypeptides were determined as follows. Carboxypeptidase P (5 µg) was added to approx. 10 nmol of protein (as monomer) in 200 µl of dilute acetic acid (pH 4.3) and the mixture was incubated at 37°C. Portions (20 µl) were withdrawn at various times (usually after 1, 5, and 10 min), and the reaction was terminated by addition of an equal volume of iodoacetic acid (1% w/v in H2O). The solvents were evaporated at room temperature in vacuo and the residue was redissolved in 60 µl of 0.2 M citric acid-NaOH (pH 2.2). Amino acids liberated by carboxypeptidase P action were determined, after reaction with ortho-phthaldialdehyde and mercaptoethanol, by HPLC and fluorospectrophotometric detection as described (Ashman and Bosserhoff, 1985).

(g) Other methods
Polyacrylamide gel electrophoresis of proteins under denaturing and reducing conditions was as described (Lugtenberg et al., 1975). Gels were either stained with Coomassie brilliant blue Serva R-250 or analyzed by autoradiography. Protein was determined by the dye-binding method of McKnight (1977).

RESULTS

(a) The nucleotide sequence
RSF1010 DNA contains a unique HpaI restriction site (Bagdasarian et al., 1982). This site will be used as a reference point to present the nucleotide sequence and the physical and genetic maps of the plasmid (Figs. 1 and 2). The 5' end of the strA gene is referred to as the left end, and the 3' end of the sul gene as the right end. The DNA strand which has the same sequence as the mRNAs coding for all but one of the eleven known RSF1010 proteins (see below) has been arbitrarily designated as l and its complement as r. The sequence presented in Fig. 1 is that of the l strand. We found that the plasmid consists of 8684 bp and has a 61% G + C content. The Mr of the sodium salt of RSF1010 DNA was calculated to be 5.72 × 10^6.

The sequence contains previously mapped sites for AccI (nt positions …), AvaI (258, …, …), BstEII (7510), DraII (2745), EcoRI (8676), EcoRV (…), HincII (…, 8682), HpaI (8682), NsiI (8253), PstI (…), PvuII (…), SacI (201), ScaI (…), SphI (161, 294, 619), SspI (75, …), XhoII (…) and XmnI (2030) (Bagdasarian et al., …; Haring and Scherzinger, 1988; and unpublished data). In addition, sites for AatII (…), AflIII (4586), BanII (201), NarI (…, 7949), NotI (1664), NruI (169, 441), SacII (194), SfiI (2187), …fI (1593), and XmaIII (1665) are predicted. Sites for the endonucleases BamHI, BglII, …I, HindIII, KpnI, NdeI, PvuI, SalI, SmaI and XhoI do not appear in the sequence. The sequence of the mobilization region (nt positions …) has been previously reported by another laboratory (Derbyshire and Willetts, 1987; Derbyshire et al., 1987) and is in complete agreement with that presented here.

[Fig. 1: nucleotide sequence listing with predicted amino acid translations; the listing itself is not reproduced here.]
Fig. 1. Complete nucleotide sequence of RSF1010 DNA, numbered from the center of the unique HpaI cleavage site. Only the mRNA-like strand, coding for all but the last of the eleven identified RSF1010 proteins, A, B, B', C, D, E, F, G, H, I and K, is shown in the 5'-to-3' orientation. The predicted amino acid sequences of the ORFs corresponding to these eleven proteins are indicated below the nucleotide line (see also Fig. 2 and Table III). Note that ORF H begins with a TTG start codon and that B' is translated by reinitiation within ORF B (at nt 4408) without change in reading phase. Solid lines under amino acids designate those N- and C-terminal sequences that were experimentally determined for purified proteins (in those instances where an aa was not unambiguously identified by the sequencing procedure, the residue is underscored by a dot). The nucleotide sequences to which Derbyshire and Willetts (1987) and Haring and Scherzinger (1989) have previously mapped oriT and oriV functions are underlined with a dashed line. The major direct and inverted repeats found in these regions are marked with arrows. Asterisks indicate stop codons.

[Fig. 2: physical and genetic map of RSF1010 (0 to 8.684 kb) showing strA, strB, mobC, mobB, mobA, repB, repA, repC, sul, oriV and oriT.]
Fig. 2. Organization of the RSF1010 genome. Distances (in kb), corresponding to the numbering of Fig. 1, are shown on the top line, as are the restriction sites that had been used for subcloning RSF1010 or pKT228 fragments in M13 sequencing; the site of insertion of Tn3 in pKT228 is marked (small triangle, 228). The approximate positions of the genes and DNA sites discussed in this paper are indicated on the second line, as are the locations and orientations of the six known E. coli RNA polymerase promoters (arrows labeled P1 through P6). The coding regions for the eleven identified RSF1010 proteins (shaded boxes labeled A, B, B', C, D, E, F, G, H, I, and K) and several putative ORFs identified by the computer program (open boxes) are aligned with the physical and genetic maps and assigned to their respective rightward (R1-R3) and leftward (L1-L3) reading frames. Dotted lines indicate possible variations to identified ORFs.

(b) Coding regions of the RSF1010 genome
The sequence was examined in both directions for potential protein-coding regions, that is, regions starting with an ATG or GTG codon and terminating after 60 or more coding triplets with any of the three stop codons. Forty-nine such translation ORFs were found, 24 on the l strand and 25 on the r strand, as shown on the diagram of the sequence (Fig. 2, open and shaded boxes). To test which potential RSF1010-coding sequences were actually expressed, various fragments of the plasmid (i.e., those relevant to this study and specified in Table I) were cloned onto controlled expression vectors, and unlabeled or 35S-labeled lysates of cells containing such recombinant plasmids were analyzed by SDS-PAGE. These experiments resulted in the identification of eleven different RSF1010 proteins that we have designated A, B, B', C, D, E, F, G, H, I, and K. Eight of these proteins have been purified in the course of this work (Fig. 3).
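The ORF criterion stated in section (b), a start at ATG or GTG followed by at least 60 coding triplets before any of the three stop codons, scanned on both strands, can be sketched as below. This is a minimal illustration under stated assumptions, not the UWGCG procedure used in the paper: the sequence is treated as linear (an ORF running through the origin of the circular map would be missed), only the first start codon in each stop-delimited frame is reported, and the rare TTG start noted for ORF H is not considered. All function names are hypothetical.

```python
STARTS = {"ATG", "GTG"}
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_on_strand(seq, min_codons):
    """Return (start, end) pairs, 0-based half-open, stop codon included."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i  # first start codon after the previous stop
            elif codon in STOPS:
                # (i - start) // 3 counts coding triplets, excluding the stop
                if start is not None and (i - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq, min_codons=60):
    """Scan both strands; coordinates always refer to the input (l) strand."""
    seq = seq.upper()
    n = len(seq)
    result = [("l", s, e) for s, e in orfs_on_strand(seq, min_codons)]
    for s, e in orfs_on_strand(reverse_complement(seq), min_codons):
        result.append(("r", n - e, n - s))
    return result


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    print(find_orfs(toy, min_codons=6))
```

Run on a real sequence with the default `min_codons=60`, the function returns one entry per candidate ORF with its strand label, which corresponds to the open and shaded boxes of Fig. 2 only to the extent that the assumptions above hold.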
Each of these identified RSF1010 proteins was assigned to its respective ORF and to its gene in at least one of the following ways: by size determination of the polypeptides synthesized from plasmids containing intact or deletion-harboring target ORFs (Fig. 4, and other experiments not shown), by N- and/or C-terminal amino acid sequence determinations on the purified protein (underlined in Fig. 1), by determination of biochemical activity of purified proteins and biochemical analysis of RSF1010 mutants (Scherzinger et al., 1984; Haring and Scherzinger, 1989). For two of the purified proteins (A and B'), total amino acid composition was also determined and found to be in good agreement with the compositions predicted from the nucleotide sequence of the respective genes (Table II). The locations, orientations and sizes we deduce from the above analyses for protein-coding RSF1010 regions are summarized in the sequence shown in Fig. 1 and in Table III; a detailed description of these regions with reference to the genetic organization of the plasmid is presented in the DISCUSSION.

(c) Transcription signals
The sequences of six putative RSF1010 promoters, identified as consensus sequences specific for E. coli RNA polymerase and located at sites previously shown by electron microscopy to bind purified RNA polymerase (Bagdasarian et al., 1981), are shown in Table IV.
AI1but one of these promoter sequences are oriented for rightward transcription and all are positioned so that they are capable of directing transcription intoTABLE II Comparison ofresults of amino acid analysis for purified proteins A and B’ with the amino acid compositions predicted from the nucleotide sequence Amino acid Predicted A Asp + Asn Thr Ser Glu + Gin Pro Gly Ala Val Met Be Leu TYr Phe His Lys Arg 20 I 16 28 19 28 39 18 9 13 27 4 9 9 8 19 B’ 28 13 17 46 14 25 44 15 10 8 27 7 6 5 17 34 Analysis” A 20.6 6.5 14.2 31.1 18.2 27.2 38.3 16.0 8.6 11.3 29.6 4.7 9.8 8.8 8.0 20.2 273.1 B’ 27.7 12.5 18.1 44.3 15.7 25.8 46.6 13.3 9.6 8.3 28.1 6.9 5.2 5.3 16.1 32.4 315.968’ACHDFE25.7-o-*_14.4aFig. 3. SDS-PAGE of the eight purified RSFlOlO proteins (A, B, B’, C, D, E, F and H) used for terminal amino acid sequence analysis. Samples of the purified proteins (3-6 1(g)were denatured, reduced, and analyzed on a 17.5% polyacrylamide gel containing 0.1% SDS. A C~massie-staled gel is shown. The leftmost lane contains standard proteins whose molecular masses are given in kDa. In the order of decreasing size these are: DNA polymerase I, bovine serum albumin, ovalbumin, rchymotrypsinogen, egg lysozyme and bovine trypsin inhibitor.273316a Cysteine and tryptophan contents were not determined. The predicted Cys and Trp contents are 1 and 4 in RepA and 2 and 4 in RepB’ , respectively. 281at least one of the eleven identified RSFlOlO coding regions (Fig. 2). Recently, transcript analyses by nuclease S 1 protection as well as assays of promoter activity in vivo and in vitro (Bagdasarian et al., 1987; Derbyshire et al., 1987; E.S., unpublished) have shown that each of these promoters can function inE. coli. They provided no convincing evidence for the existence of other promoters in RSFlOlO DNA. No transcription termination sites have been experimentally identified in the RSFIOIO sequence so far. 
Examination of the sequence in the regions where termination sites would be expected revealed a G + C-rich dyad symmetry region followed by a T-rich sequence, located at the end of the rep operon (nt positions ), which may be a Rho-independent termination signal.

Fig. 4. T7 φ10 promoter-directed expression of the RSF1010 ORF G, H, I and K proteins. Conditions for cell growth and specific labeling of proteins made under direction of the T7 φ10 promoter were essentially as described by Tabor and Richardson (1985). Cells of HB101 [pGP1-2] transformed with the indicated plasmids (see Table I for details) were grown at 30°C in M9 medium supplemented with thiamine (20 µg/ml) and a mixture of all 20 common aa except cysteine and methionine (200 µg/ml each). When the cell density reached 1 × 10^8/ml (A590 = 0.2), cultures were heated to 42°C to induce synthesis of T7 RNA polymerase. After 20 min at 42°C, rifampicin was added to 200 µg/ml, and after an additional 15 min at 42°C, the cells were grown for 20 min at 30°C. A 1-ml sample of each culture was then pulse-labeled with 10 µCi of L-[35S]methionine (800 nCi/mmol) for 5 min. The labeled cells were collected by centrifugation, resuspended in 60 µl of loading buffer (50 mM Tris·HCl pH 6.8/1% SDS/1% 2-mercaptoethanol/10% glycerol/0.025% bromphenol blue), heated for 3 min at 95°C and loaded onto a 17.5% polyacrylamide gel containing 0.1% SDS. Following electrophoresis, the gels were first stained with Coomassie blue, then dried and autoradiographed for two days. The resulting autoradiograms are shown, with the migration position of the mass standards used to estimate polypeptide sizes on the gels indicated at left in kDa. In order of decreasing size they are: bovine serum albumin, ovalbumin, carbonic anhydrase, trypsin inhibitor, egg lysozyme, and cyanogen bromide peptide II from sperm-whale myoglobin (8.2 kDa). The arrows labeled 26, 27, 28 and 10 indicate insert-specific 35S-labeled bands with molecular masses (in kDa) corresponding to products of ORFs G, H, I and K, respectively. Shorter polypeptides derived from two of these ORFs, G and H, are apparent for plasmids pOT784 and pOT745, in which the 3' end of the respective coding sequences is deleted (see legend to Table I). The 54-aa fragment of the ORF K protein, predicted to be expressed from pOT762d2, is too small to be detected in the gel system used.

TABLE III
Location of known protein-coding regions in RSF1010 and their assignment to specific genes

Position of coding sequence (column as extracted): 63- 863 866-70 98-76 54-26 75-8660 896

Protein   Translational initiation signal (a)   No. of aa (b)   Gene
H         GGAAAACUGAAGGAACCUCCAUUG              267             strA
I         GACCCUCUGACUUGGGGUUGAUG               278             strB
K         AUUAACAAUGGGGUGUCAAGAUG               94              mobC
B         AAAAGCAACAGCGAGGCAGCAUG               709             mobA/repB
D         AUACCGGGAGGCAAUAGACCAUG               137             mobB
B'        UGGCUGCGCUUGGUWCCCGAUG                323             repB (repB')
E         CACGCACAGAAGGGGGUUUUAUG               70
F         CAAGUUUCUCGGUGACUGAUAUG               68              cac
A         GCCUGCAAAGGAGGCAAUCAAUG               279             repA
C         CAAGAGCAAGGGGGUGCCCCGUG               283             repC
G         UUAUUAUUAUAGAAGCCCCCAUG               262             sul

(a) The start codons are underlined as are the nucleotides in the pre-start region that could pair with the 3' end of E. coli 16S rRNA (AUUCCUCCACU...5').
(b) Protein sizes are calculated from the nucleotide sequence and include the N-terminal methionine although this residue is known to be absent from the purified proteins A, B and C.

TABLE IV
Location of promoter sequences in RSF1010 (from left to right in serial order)

Columns: Serial No.; Position (a); Promoter sequence (b)
Serial No. and position entries (as extracted): L R R 7839R
Consensus: -35 region t cTTGACa; -10 region tg.TAtAaT

GTTGCTCCCCTTAACCATCTTGACACCCCATmTTAATG1GCTGTCTCGT
AGGGTCGGGATTGCcGCCGCTGTGCCTC.CATGATAGCCTACGAGACAGC
TGACACCCCATTGTT4ATGTGCTGTCTCGTAGSCTATCATGGAGGCACAG
CATGTAGTGCTTGCGTTGGTACTCACG..CCTGTTATACTATGAGTACTC
(or) CATGTAGTGCTTGCGTTGGTACTCACGCCTGTTATACTATGAGTACTCAC
CATAGGCCGmTCcTGGCTTTGCTTCC.AGATGmGCXCTTCTGCTCC
CGCCTGGGGATTCCCTTTCGACCCGAGC.ATCCGTATGATACTCATGCTC

(a) The position of the last (3') T of the -10 region is given since the exact position of the first nt of the RNA chain is not known for any of these promoters. R indicates the rightward and L the leftward transcription on the conventional map (Fig. 2).
(b) The evidence that these sequences have promoter activity is given in RESULTS, section c and DISCUSSION, sections a-c. The base sequences are for the DNA strand that has the same sequence as the mRNA produced from each promoter. The consensus sequences for the -35 and -10 region (indicated at the top of the table) are those for the known promoters for E. coli RNA polymerase (Hawley and McClure, 1983); homologous nucleotides in the RSF1010 sequences are underlined.

(d) Codon usage
Table V shows the codons used in the regions of RSF1010 DNA that code for proteins.
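A codon tally of the kind summarized in Table V, together with the % G + C at each codon position, can be sketched as follows. This is a minimal illustration, not the procedure used by the authors: the function names `codon_counts` and `gc_by_position` and the toy sequence are hypothetical, and the toy ORF is not RSF1010 sequence. Whether the stop codon is included in the tally is a choice left to the caller; here it is simply counted as given.

```python
from collections import Counter


def codon_counts(cds):
    """Count codons in an in-frame coding sequence (DNA, 5'->3')."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of 3 nt")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def gc_by_position(counts):
    """Return % G+C at codon positions 1, 2 and 3, weighted by codon count."""
    total = sum(counts.values())
    percentages = []
    for pos in range(3):
        gc = sum(n for codon, n in counts.items() if codon[pos] in "GC")
        percentages.append(100.0 * gc / total)
    return percentages


# Hypothetical 6-codon ORF used only to exercise the functions.
toy_cds = "ATGGCCCTGAAGCGCTAA"
counts = codon_counts(toy_cds)
pos1, pos2, pos3 = gc_by_position(counts)
print(counts)
print(round(pos1, 1), round(pos2, 1), round(pos3, 1))
```

Applied separately to each coding region, the third value corresponds to the "3rd position" row at the bottom of Table V, which is the figure the text uses to contrast the replication and mobilization genes with the resistance genes.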
As may be deduced from this table, no uniform codon usage can be detected in the RSF1010 genome as a whole. However, the codons used in the genes involved in plasmid replication/maintenance and mobilization (A, B/B', C, D, F, and K) are strikingly similar in that there is a strong bias for codons ending in either a G or C residue in every case where redundancy exists (except perhaps for Tyr). This is different from the relative codon usage in the resistance genes (G, H, and I) and in gene E (whose biological function is unknown), which do not show such a consistent pattern. Overall, 78.3% of the codons in the genes involved in replication/maintenance and mobilization have G or C in the third position compared with 61.0% of the codons in genes E, G, H and I (for individual values see bottom part of the table). Moreover, gene E differs from all other genes by its low G + C content in the first codon position (49.3% vs. 63.2-72.1% for the other genes). The similarity of codon usage for both the replication/maintenance and mobilization genes and the marked difference in codon usage for gene E and the three resistance genes suggest that the latter genes have been incorporated more recently in the evolution of the RSF1010 plasmid.

TABLE V
Distribution of codon usage for the eleven RSF1010 genes

[Per-codon counts for genes A, B, B', C, D, F, K, E, G, H and I are not recoverable from the extracted text.]

% G + C content at the three positions in the codons used:

Gene           A     B     B'    C     D     F     K     E     G     H     I
1st position   70.7  72.1  69.1  68.3  63.2  69.6  68.4  49.3  66.2  67.9  67.4
2nd position   47.5  48.9  47.2  46.8  47.8  44.9  41.1  46.5  52.5  45.9  42.3
3rd position   82.9  76.2  83.6  82.7  84.1  75.4  73.7  64.8  66.2  54.8  58.1

DISCUSSION

(a) Regions involved in plasmid replication and its regulation
For replication of RSF1010 DNA in E. coli, the origin of vegetative replication, oriV, and the products of three genes, repA, repB and repC, are required (Scherzinger et al., 1984). Purified replication protein RepA was found to contain three activities: (1) an ATPase activity which is stimulated by ssDNA, (2) an ATP-dependent ssDNA-binding activity, and (3) a helicase activity. The latter activity catalyzes the unwinding of extensive stretches of dsDNA in a reaction requiring ATP hydrolysis (Haring and Scherzinger, 1989). The repC gene codes for a basic protein having Val as the N-terminal amino acid. The 5'-terminal portion of repC overlaps with 14 nt of the 3'-terminal part of the repA gene.
Messenger RNA of this region can form a loop which encloses the ribosome-binding site and the start codon of repC in a ds structure of considerable stability. The inability of recombinant plasmids to express the repC gene at high levels unless the repA gene is also expressed supports our proposal that repC is regulated at the translational, as well as transcriptional level (Haring and Scherzinger, 1989; Frey and Bagdasarian, 1989). Purified RepC protein was shown to be a dsDNA-binding protein (Haring et al., 1985; Haring and Scherzinger, 1989). Its primary binding site lies within the oriV region (Haring et al., 1985). Sequences that function as the RSF1010-specific origin of vegetative replication, oriV, are located between nt 2347 and 2742 (Fig. 1; Haring and Scherzinger, 1989). The most prominent feature of the oriV region is the presence of three identical 20-bp direct repeats that are essential for the origin function. Strong evidence indicates that they are the sites for binding of the initiator RepC protein (Haring et al., 1985). This binding is believed to be the reason for the strong incompatibility effect exerted by the direct repeats cloned onto multicopy vectors (Persson and Nordstrom, 1986). The direct repeats are followed by 28 bp of G + C-rich sequence (86% G + C) and by 31 bp of A + T-rich sequence (71% A + T). The oriV region ends with a 152 bp (nt ) segment of extensive dyad symmetry, which is also essential for origin function. These two essential domains are separated by a non-essential region, marked by the HaeII fragment at nt positions 2511-2524, which may be deleted without visible deterioration of the oriV function (Haring and Scherzinger, 1989). By using a series of ss M13-RSF1010 chimeras as templates in an in vitro DNA replication system, it was shown that the palindromic region of oriV contains two unique sequences, one on each strand, capable of acting as ori for complementary-strand DNA synthesis.
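The two sequence features described for oriV above, identical direct repeats and segments of skewed base composition, can be located with a few lines of code. The sketch below is an illustration under stated assumptions: `gc_percent` and `direct_repeats` are hypothetical helper names, and the toy sequence (an 8-nt unit repeated three times followed by a G + C-rich tail) is invented and is not the RSF1010 origin.

```python
from collections import defaultdict


def gc_percent(seq):
    """Percentage of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def direct_repeats(seq, k, min_copies=2):
    """Map each k-mer found at least min_copies times to its 0-based start positions."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: starts for kmer, starts in positions.items()
            if len(starts) >= min_copies}


# Hypothetical sequence: three copies of an 8-nt unit plus a G+C-rich tail.
unit = "ACGTTGCA"
tail = "GGCCGCGA"
toy = unit * 3 + tail

repeats = direct_repeats(toy, k=8, min_copies=3)
tail_gc = gc_percent(tail)
print(repeats)
print(tail_gc)
```

For the real origin one would call `direct_repeats` with `k=20` and `min_copies=3` on the oriV segment, and `gc_percent` on the 28-bp and 31-bp segments that follow the repeats.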
These sequences were named oriL (nt positions ) and oriR (nt ). In contrast to the observation with duplex oriV DNA, the function of oriR and oriL on ssDNA templates is no longer dependent on the RSF1010 RepA and RepC proteins; however, it still depends on RepB', indicating that this protein is directly responsible for primer synthesis at the RSF1010 origin (Haring and Scherzinger, 1989). The same regions have recently been found to function as ssDNA initiation sites in vivo when inserted into mutant M13 phage genomes (Honda et al., 1988).

Both proteins RepB and RepB' are capable of primer synthesis at the RSF1010 origin in vitro (Haring and Scherzinger, 1989). However, deletions or insertions in the ORF for RepB, which leave the ORF of RepB' intact, do not affect the ability of RSF1010 to replicate in E. coli (Bagdasarian et al., 1982; Scherzinger et al., 1984). Genes located between repB and repA encode two small proteins, E and F. The latter, a dimer of 68-aa polypeptides, was shown to bind to the operator region of the P4 promoter (unpublished). This protein presumably represents the repressor that regulates expression of its own gene as well as that of repA and repC, thus regulating the initiation frequency at the oriV and the copy number of RSF1010 (Haring et al., 1985; Frey and Bagdasarian, 1989). We propose therefore to call the gene specifying the F protein cac (for control of repA and C genes).

(b) Region for conjugative mobilization
Three genes, mobA, mobB and mobC, were shown to be required, besides oriT, the origin of conjugational DNA transfer, for mobilization of RSF1010 in the presence of conjugative plasmids (Derbyshire and Willets, 1987; Derbyshire et al., 1987). This study has shown that their products correspond to proteins B, D and K, respectively. A comparison of the data of Derbyshire et al. (1987) with the results of this work shows that ORF B, specifying the MobA protein, is identical to that of the gene repB.
It has been previously established that repB and repB' (translated as the 3'-terminal coding region of repB) specify RSF1010 primases (Haring and Scherzinger, 1989). We therefore conclude that protein RepB has two functional domains: the C-terminal domain, exhibiting primase activity, essential for plasmid replication, and the N-terminal domain, active in plasmid mobilization, most probably as a component of the relaxation complex (E.S., unpublished). Gene mobB, which overlaps with mobA, is not translated in the same phase with it. Gene mobC is the only gene, identified so far, that is transcribed from right to left on the map shown in Fig. 2. The three mob genes are therefore transcribed divergently by promoters P1, P2 and P3 (Fig. 2; Table IV).

(c) Antibiotic-resistance region
In the region specifying resistance to antibiotics three ORFs, G, H and I, were identified. They are presumably arranged in tandem and transcribed by two promoters, P5 and P6, identified by analysis of the sequence (Fig. 2; Table IV) as well as by direct mapping of RNA polymerase-binding sites on RSF1010 DNA (Bagdasarian et al., 1981). The first ORF, G, downstream from the P5 and P6 promoters, specifies resistance to Su. Mutants with truncated ORF G can no longer confer Su resistance to their hosts. As is indicated by amino acid homology, the ORF G translation product is presumably a dihydropteroate synthase, similar to that encoded on the chromosome of Streptococcus pneumoniae (Lopez et al., 1987). In the region coding for Sm resistance we have identified two ORFs, H and I. The ORF H translation product exhibits Sm phosphotransferase activity in vitro. Its nucleotide sequence exhibits significant homology with that of the Tn5-encoded Km phosphotransferase gene (Fig. 5). In addition, similarities between the sequences of ORF I and the str gene of Tn5 were found. Deletions at the start or the end of either ORF H or ORF I resulted in reduced Sm resistance of the host cells.
The expression of both ORFs is thus required for the high level of Sm resistance expressed by the wt RSF1010 (E.S. and P.S., unpublished). Until the enzymes specified by ORFs G, H and I are properly classified by biochemical analysis, we propose to name the respective genes sul, strA and strB (as indicated in Fig. 2).

(d) Concluding remarks
The fact that some of the ORFs of RSF1010 produce proteins when appropriate DNA fragments are inserted into expression vectors does not prove that these proteins are expressed by wt RSF1010 in the cell. However, biochemical experiments on the in vitro RSF1010 replication system (Scherzinger et al., 1984; Haring et al., 1985; Haring and Scherzinger, 1989) as well as genetic studies on the mutants of RSF1010 (Scholz et al., 1984; Bagdasarian et al., 1987; Frey and Bagdasarian, 1989) strongly suggest that at least those proteins purified in the course of this study are expressed by the native plasmid and are functional in its replication, transfer and regulation.

Determination of the complete nucleotide sequence of RSF1010, purification of proteins specified by the plasmid and analysis of their function resolved several problems of practical importance and scientific interest: (1) it has provided a detailed structure of the plasmid used as replicon for one of the most versatile [...]; (2) it helped to dissect and understand the mechanism of RSF1010 [...]; (3) it explained, at least in part, the molecular basis for the wide host range exhibited by this plasmid.

Fig. 5. Dot matrix analysis showing similarities between the nucleotide sequences of the RSF1010 ORFs H and I and the Tn5 resistance genes km and str. Nucleotide numbers are shown for each sequence on the corresponding margins and the coding sequences are boxed. The Tn5 sequence was taken from Mazodier et al. (1985) and from references cited therein. The UWGCG programs (Devereux et al., 1984) used were 'Compare/Dotplot' with conditions of window = 28 and stringency = 17.
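The dot-matrix comparison in Fig. 5 rests on a window/stringency filter: a point is recorded wherever two windows of fixed length share at least a threshold number of identical positions. The sketch below shows that idea in its simplest form. It is an assumption-laden illustration, not a reimplementation of the UWGCG 'Compare/Dotplot' programs, whose exact scoring and plotting conventions are not reproduced here; the function name `dot_matrix` and the toy sequences are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window, stringency):
    """Return (i, j) window-start pairs where at least `stringency`
    of the `window` aligned positions are identical."""
    a, b = seq_a.upper(), seq_b.upper()
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            if matches >= stringency:
                hits.append((i, j))
    return hits


# Hypothetical sequences sharing the 7-nt segment GATTACA.
seq_a = "GATTACAGG"
seq_b = "CCGATTACA"
exact_hits = dot_matrix(seq_a, seq_b, window=7, stringency=7)
relaxed_hits = dot_matrix(seq_a, seq_b, window=7, stringency=6)
print(exact_hits)
print(relaxed_hits)
```

With the conditions quoted in the legend one would call `dot_matrix(orf_seq, tn5_seq, window=28, stringency=17)`; runs of hits along a diagonal then mark regions of similarity. The nested loop is quadratic in sequence length, which is acceptable for gene-sized inputs but slow for whole genomes.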
Replication proteins of RSF1010 provide the most essential functions of the DNA replication complex, the primosome, such as recognition of the origin (RepC), unwinding of the DNA duplex (RepA) and primer synthesis (RepB and RepB'). The plasmid is therefore independent of these functions encoded by the host chromosome, as has been shown experimentally at least in E. coli (Scholz et al., 1984). We postulated that this independence allows RSF1010 to replicate in a wide range of Gram-negative bacterial hosts. Our information on the wide host range of RSF1010, already exploited in biotechnology (Bagdasarian et al., 1983; Fürste et al., 1986), has been increased recently by the observation that this plasmid may be transferred to plant cells from Agrobacterium tumefaciens (Buchanan-Wollaston et al., 1987). The overlapping of regulatory and coding sequences in RSF1010 shows an unusual economy in genetic organization. At present nearly 90% of its genome may be identified as information carrier for proteins or signal structures. Further work will be needed to fully understand the mechanisms of RSF1010 replication, of its regulation that ascertains the maintenance of the plasmid at a specific number of copies in the cell and of the precise relation of these processes to the wide host range properties.

ACKNOWLEDGEMENTS
We thank H. Schuster for support and continuous encouragement. The expert assistance of Marion Wassil, Sabine Otto, and Norbert Voll is gratefully acknowledged. A grant from Research Excellence Fund of the State of Michigan (to M.B.) is gratefully acknowledged.

REFERENCES
Amann, E., Brosius, J. and Ptashne, M.: Vectors bearing a hybrid trp-lac promoter useful for regulated expression of cloned genes in Escherichia coli. Gene 25 (8.
Ashman, K. and Bosserhoff, A.: Amino acid analysis by high performance liquid chromatography and precolumn derivatisation. In Tschesche, H. (Ed.), Modern Methods in Protein Chemistry. Walter de Gruyter, Berlin, 1985, pp. 155-171.
Bagdasarian, M., Lurz, R., Rückert, B., Franklin, F.C.H., Bagdasarian, M.M., Frey, J. and Timmis, K.N.: Specific purpose plasmid cloning vectors, II. Broad-host-range, high-copy-number, RSF1010-derived vectors, and host vector system for gene cloning in Pseudomonas. Gene 16 (7.
Bagdasarian, M., Bagdasarian, M.M., Lurz, R., Nordheim, A., Frey, A. and Timmis, K.N.: Molecular and functional analysis of the broad host range plasmid RSF1010 and construction of vectors for gene cloning in Gram-negative bacteria. In Mitsuhashi, S. (Ed.), Bacterial Drug Resistance. Japan Scientific Society Press, Tokyo, 1982, pp. 183-197.
Bagdasarian, M.M., Amann, E., Lurz, R., Rückert, B. and Bagdasarian, M.: Activity of the hybrid trp-lac (tac) promoter of Escherichia coli in Pseudomonas putida. Construction of broad-host-range, controlled-expression vectors. Gene 26 (2.
Bagdasarian, M.M., Scholz, P., Frey, J. and Bagdasarian, M.: Regulation of the rep operon expression in the broad host range plasmid RSF1010. In Novick, R. and Levy, S. (Eds.), Evolution and Environmental Spread of Antibiotic Resistance Genes. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1987, pp. 209-223.
Biggin, M.D., Gibson, T.J. and Hong, G.F.: Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80 (-3965.
Boyer, H.W. and Roulland-Dussoix, D.: A complementation analysis of the restriction and modification of DNA in Escherichia coli. J. Mol. Biol. 41 (2.
Brosius, J. and Holy, A.: Regulation of ribosomal RNA promoters with a synthetic lac operator. Proc. Natl. Acad. Sci. USA 81 (-6933.
Buchanan-Wollaston, V., Passiatore, J.E. and Cannon, F.: The mob and oriT mobilization functions of a bacterial plasmid promote its transfer to plants. Nature 328 (5.
De Graaf, J., Crosa, J.H., Heffron, F. and Falkow, S.: Replication of the non-conjugative plasmid RSF1010 in Escherichia coli K-12. J. Bacteriol. 134 (-1122.
Derbyshire, K.M. and Willets, N.: Mobilization of the non-conjugative plasmid RSF1010: a genetic analysis of its origin of transfer. Mol. Gen. Genet. 206 (0.
Derbyshire, K.M., Hatfull, G. and Willets, N.P.: Mobilization of the non-conjugative plasmid RSF1010: a genetic and DNA sequence analysis of the mobilization region. Mol. Gen. Genet. 206 (8.
Devereux, J., Haeberli, P. and Smithies, O.: A comprehensive set of sequence programs for the VAX. Nucleic Acids Res. 12 (5.
Frey, J. and Bagdasarian, M.: The molecular biology of IncQ plasmids. In Thomas, C. and Franklin, F.C.H. (Eds.), Molecular Biology of Broad Host Range Plasmids. Academic Press, London, 1989, in press.
Fürste, J.P., Pansegrau, W., Frank, R., Blöcker, H., Scholz, P., Bagdasarian, M. and Lanka, E.: Molecular cloning of the plasmid RP4 primase region in a multi-host-range tacP expression vector. Gene 48 (1.
Garoff, H. and Ansorge, W.: Improvement of DNA sequencing gels. Anal. Biochem. 115 (7.
Guerry, P., Van Embden, J. and Falkow, S.: Molecular nature of two nonconjugative plasmids carrying drug resistance genes. J. Bacteriol. 117 (0.
Haring, V. and Scherzinger, E.: The replication proteins of IncQ plasmid RSF1010. In Thomas, C. and Franklin, F.C.H. (Eds.), Molecular Biology of Broad Host Range Plasmids. Academic Press, London, 1989, in press.
Haring, V., Scholz, P., Scherzinger, E., Frey, J., Hatfull, G., Willets, N.W. and Bagdasarian, M.: Protein RepC is involved in copy number control of the broad host range plasmid RSF1010. Proc. Natl. Acad. Sci. USA 82 (-6094.
Hawley, D.K. and McClure, W.R.: Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 11 (-2255.
Heffron, F., Rubens, C. and Falkow, S.: Translocation of a plasmid DNA sequence which mediates ampicillin resistance: molecular nature and specificity of insertion. Proc. Natl. Acad. Sci. USA 72 (-3627.
Honda, Y., Sakai, H. and Komano, T.: Single-strand DNA initiation signals located in the oriV region of plasmid RSF1010. Gene 68 (8.
Kushner, S.R.: An improved method for transformation of Escherichia coli with ColE1 derived plasmids. In Boyer, H.W. and Nicosia, S. (Eds.), Genetic Engineering. Elsevier, Amsterdam, 1978, pp. 17-23.
Lopez, P., Espinosa, M., Greenberg, B. and Lacks, S.A.: Sulfonamide resistance in Streptococcus pneumoniae. DNA sequence of the gene encoding dihydropteroate synthase and characterization of the enzyme. J. Bacteriol. 169 (-4326.
Lugtenberg, B., Meijers, J., Peters, R., Van der Hoek, P. and Van Alphen, L.: Electrophoretic resolution of the 'major outer membrane protein' of Escherichia coli K-12 into four bands. FEBS Lett. 58 (8.
Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Mazodier, P., Cossart, P., Giraud, E. and Gasser, F.: Completion of the nucleotide sequence of the central region of Tn5 confirms the presence of three resistance genes. Nucleic Acids Res. 13 (5.
McKnight, G.S.: A colorimetric method for the determination of submicrogram quantities of protein. Anal. Biochem. 78 (.
Messing, J.: New M13 vectors for cloning. Methods Enzymol. 101 (.
Nordheim, A., Hashimoto-Gotoh, T. and Timmis, K.N.: Location of two relaxation sites in R6K and single sites in pSC101 and RSF1010 close to origins of vegetative replication: implication for conjugational transfer of plasmid DNA. J. Bacteriol. 144 (2.
Persson, C. and Nordstrom, K.: Control of replication of the broad host range plasmid RSF1010: the incompatibility determinant consists of directly repeated DNA sequences. Mol. Gen. Genet. 203 (2.
Poncz, M., Solowiejczyk, D., Ballantine, M., Schwartz, E. and Surrey, S.: Nonrandom DNA sequence analysis in bacteriophage M13 by the dideoxy chain-termination method. Proc. Natl. Acad. Sci. USA 79 (-4302.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (-5467.
Scherzinger, E., Bagdasarian, M.M., Scholz, P., Lurz, R., Rückert, B. and Bagdasarian, M.: Replication of the broad host range plasmid RSF1010: requirement for three plasmid-encoded proteins. Proc. Natl. Acad. Sci. USA 81 (8.
Scholz, P., Haring, V., Scherzinger, E., Lurz, R., Bagdasarian, M.M., Schuster, H. and Bagdasarian, M.: Replication determinants of the broad-host-range plasmid RSF1010. In Helinski, D.R., Cohen, S.N., Clewell, D.B. and Jackson, D.A. (Eds.), Plasmids in Bacteria. Plenum, New York, 1984, pp. 243-259.
Tabor, S. and Richardson, C.C.: A bacteriophage T7 RNA polymerase/promoter system for controlled exclusive expression of specific genes. Proc. Natl. Acad. Sci. USA 82 (-1078.
Wittmann-Liebold, B. and Ashman, K.: On-line detection of amino acid derivatives released by automatic Edman degradation of polypeptides. In Tschesche, H. (Ed.), Modern Methods in Protein Chemistry. Walter de Gruyter, Berlin, 1985, pp. 303-327.