{"id":3289,"date":"2020-11-27T08:00:23","date_gmt":"2020-11-27T08:00:23","guid":{"rendered":"https:\/\/uni.hi.is\/eirikur\/?p=3289"},"modified":"2020-11-26T11:49:05","modified_gmt":"2020-11-26T11:49:05","slug":"islenskar-ordtidnirannsoknir","status":"publish","type":"post","link":"https:\/\/uni.hi.is\/eirikur\/2020\/11\/27\/islenskar-ordtidnirannsoknir\/","title":{"rendered":"Icelandic word-frequency research"},"content":{"rendered":"<p>People sometimes wonder which words are the most common in Icelandic. That question can now be answered, but first one has to recognize that the word <strong>or\u00f0<\/strong> (\u2018word\u2019) is ambiguous: it has at least three senses that matter here. The first is \u2018lexeme\u2019 (<em>uppflettior\u00f0<\/em>, \u2018headword\u2019), which comprises the base form of the word (its lemma or dictionary form) and all of its inflected forms. The lexeme <em>eiga<\/em> (\u2018to own\u2019) thus covers, besides the lemma itself, inflected forms such as <em>\u00e1<\/em>, <em>eigum<\/em>, <em>\u00e1tti<\/em>, <em>\u00e6tti<\/em>, and so on. But the form <em>\u00e1<\/em> can of course also be the preposition <em>\u00e1<\/em> (\u2018on\u2019), the noun <em>\u00e1<\/em> (\u2018river\u2019), or an inflected form of the noun <em>\u00e6r<\/em> (\u2018ewe\u2019).<\/p>\n<p>The second sense is \u2018word form\u2019 (<em>or\u00f0mynd<\/em>): a particular string of letters or sounds, regardless of grammatical analysis. Thus <em>\u00e1<\/em> is one and the same word form whether it belongs to the verb <em>eiga<\/em>, the preposition <em>\u00e1<\/em>, the noun <em>\u00e1<\/em>, or the noun <em>\u00e6r<\/em>. The third sense is \u2018running word\u2019 (<em>lesm\u00e1lsor\u00f0<\/em>): a word as it occurs in a text. 
This is the sense that applies when the length of a text is specified: if an essay is required to be a thousand words long, for example, that does not mean that a thousand different running words or word forms should occur in it, but that the total number of running words should be a thousand. Each word form is then counted every time it occurs.<\/p>\n<p>Computers have made it extremely easy to count running words and word forms. Counting lexemes, however, is much more complicated because it requires grammatical analysis, so that, for instance, each occurrence of the word form <em>\u00e1<\/em> can be assigned to the correct lemma on the basis of its syntactic position and sometimes also its meaning. Until recently such analysis had to be done by hand, but analysis software now exists (for Icelandic, e.g. <a href=\"http:\/\/nlp.cs.ru.is:8080\/IceNLPWeb\/icenlp_isl.html\">IceNLP<\/a> and <a href=\"https:\/\/greynir.is\/analysis\">Greynir<\/a>) that analyzes the words automatically. Such analysis is never perfectly accurate, but it is usually accurate enough to serve most purposes.<\/p>\n<p>The first large frequency survey of Icelandic texts was carried out by the school principal \u00c1rs\u00e6ll Sigur\u00f0sson, who <a href=\"https:\/\/timarit.is\/page\/4552984#page\/n9\/mode\/2up\">published<\/a> his results in <em>Menntam\u00e1l<\/em> in 1940. 
Its purpose was practical: \u201cto find a way to make spelling instruction more accessible and realistic than before, yet more likely to yield better results\u201d. The texts came from essays written by children, letters written by adults, school readers, and texts on natural science, history and geography, about 100 thousand running words in all. The most rigorous <a href=\"https:\/\/clarin.is\/gogn\/ifd\/\">frequency survey<\/a> of Icelandic texts, however, was carried out at Or\u00f0ab\u00f3k H\u00e1sk\u00f3lans, and its results appeared in <em>\u00cdslensk or\u00f0t\u00ed\u00f0nib\u00f3k<\/em> (the Icelandic Frequency Dictionary) in 1991. It was based on about 500 thousand running words from five different text types.<\/p>\n<p>In the years 2004-2012 a large <a href=\"https:\/\/clarin.is\/gogn\/mim\/\">text collection<\/a> was built at \u00c1rnastofnun (the \u00c1rni Magn\u00fasson Institute for Icelandic Studies): <em><a href=\"http:\/\/mim.arnastofnun.is\/\">M\u00f6rku\u00f0 \u00edslensk m\u00e1lheild<\/a><\/em> (the Tagged Icelandic Corpus), about 25 million running words in total. Its texts are many times more varied than those in <em>\u00cdslensk or\u00f0t\u00ed\u00f0nib\u00f3k<\/em> and fall into <a href=\"http:\/\/malfong.is\/files\/mim_textaflokkar_is.pdf\">23 categories<\/a>. Since 2005, <em><a href=\"https:\/\/malheildir.arnastofnun.is\/?mode=rmh2019#?lang=is-is&amp;stats_reduce=word&amp;isCaseInsensitive&amp;searchBy=word&amp;cqp=%5B%5D\">Risam\u00e1lheild<\/a><\/em> (the Icelandic Gigaword Corpus) has been under construction at \u00c1rnastofnun, and it now contains 1.64 billion running words from a variety of sources. It is the basis of <em><a href=\"http:\/\/ordtidni.arnastofnun.is\/\">Or\u00f0t\u00ed\u00f0nivefur \u00c1rnastofnunar<\/a><\/em>, the institute's word-frequency website, which currently covers just under 1.4 billion running words. 
Because of the size and variety of these collections, reliable information about Icelandic word frequency can be extracted from them. A major drawback, however, is that they contain very little spoken language.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>People sometimes wonder which words are the most common in Icelandic. That question can now be answered, but first one has to recognize that the word or\u00f0 (\u2018word\u2019) is ambiguous: it has at least three senses that matter here. The first is \u2018lexeme\u2019 (uppflettior\u00f0, \u2018headword\u2019), which comprises the base form of the word (its lemma or dictionary form) [&hellip;]<\/p>\n","protected":false},"author":141,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[158635],"tags":[],"class_list":["post-3289","post","type-post","status-publish","format-standard","hentry","category-malfar"],"_links":{"self":[{"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/posts\/3289","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/users\/141"}],"replies":[{"embeddable":true,"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/comments?post=3289"}],"version-history":[{"count":1,"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/posts\/3289\/revisions"}],"predecessor-version":[{"id":3290,"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/posts\/3289\/revisions\/3290"}],"wp:attachment":[{"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/media?parent=3289"}],"
wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/categories?post=3289"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/uni.hi.is\/eirikur\/wp-json\/wp\/v2\/tags?post=3289"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}