Changes to entry structure

At the end of August we changed the way dictionary entries are split up in some of our titles. We are bringing related definitions together, which we hope will improve the user experience. This should be a 'silent' improvement, but we know our API consumers like to be kept updated. Below are answers to the questions we anticipate.


Q: What will this mean for users?

A: Users are more likely to find the definition in the first result; there will be fewer but longer entries.

We have found that users (including API consumers) want more information on one page and prefer not to click through search results. So, we are merging the entries which relate to a single headword.

This means that users will no longer have to choose a sense before they see a definition: all definitions for a single headword will be displayed on one page. In the main British (Advanced Learner's) and American English datasets, definitions of a single word are currently spread across multiple entries. For example, a search for "find" begins with separate results such as:

find verb (DISCOVER)
find verb (JUDGE)
find noun

In the future, these will be one entry with separate definitions for all of these meanings. NB: many entries already have multiple meanings (look at "find verb (DISCOVER)").

Q: What is a headword?

A: For us, a headword is a word with a particular spelling and pronunciation.

All senses under the same headword will be brought together in one entry. Some words will continue to be separate, for instance record (verb) and record (noun), because they have different pronunciations. We will keep this under review.

Phrasal verbs and idioms will remain separate entries as well.
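
To make the headword rule concrete, here is a minimal sketch of grouping entries by spelling and pronunciation together. The tuple layout and the sample records are assumptions made for illustration only; they are not the dictionary's data model or the way the merge was actually implemented.

```python
from collections import defaultdict

# Illustrative records only: (spelling, pronunciation, label).
# This layout is an assumption for the sketch, not the real data model.
OLD_ENTRIES = [
    ("find", "faɪnd", "verb (DISCOVER)"),
    ("find", "faɪnd", "verb (JUDGE)"),
    ("find", "faɪnd", "noun"),
    ("record", "rɪˈkɔːd", "verb"),
    ("record", "ˈrek.ɔːd", "noun"),
]

def group_by_headword(entries):
    """Group entries that share both spelling and pronunciation."""
    merged = defaultdict(list)
    for spelling, pronunciation, label in entries:
        merged[(spelling, pronunciation)].append(label)
    return dict(merged)

merged = group_by_headword(OLD_ENTRIES)
for (spelling, pronunciation), labels in merged.items():
    print(spelling, pronunciation, labels)
```

In this sketch the three "find" entries collapse into one group, while the two "record" entries stay apart because their pronunciations differ.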

Q: Will the API change?

A: The API itself will not change; this is a content change.

The XML and HTML content of the entries will change, but the XML/JSON response format and the endpoints will not. The search results will also change, because there will be fewer entries.

Q: Will this break my application?

A: If your code does not make any assumptions about the structure of entries, it should continue to work.

If you have stored references to specific entryIds, those entries may no longer exist; performing a new search will return the current entries. Entries will conform to the same DTD, but if your application reads the XML structure directly, be advised that the path to senses may now include an additional block element (pos-block).
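
One way to keep XML-reading code working across this change is to search for senses at any depth instead of relying on a fixed path. The sketch below uses Python's standard xml.etree.ElementTree. Only pos-block is named in this announcement; the element names entry, sense and def, and the sample fragments, are assumptions for illustration, so check the DTD for the real names.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments: only "pos-block" comes from the announcement.
# "entry", "sense" and "def" are assumed names; the text is placeholder text.
OLD_XML = "<entry><sense><def>first definition</def></sense></entry>"
NEW_XML = (
    "<entry>"
    "<pos-block><sense><def>first definition</def></sense></pos-block>"
    "<pos-block><sense><def>second definition</def></sense></pos-block>"
    "</entry>"
)

def sense_definitions(xml_text):
    """Return the definition text of every sense, whatever its nesting depth."""
    root = ET.fromstring(xml_text)
    definitions = []
    # iter() walks all descendants, so an extra wrapper level does not matter.
    for sense in root.iter("sense"):
        definition = sense.find(".//def")
        if definition is not None:
            definitions.append("".join(definition.itertext()).strip())
    return definitions

print(sense_definitions(OLD_XML))
print(sense_definitions(NEW_XML))
```

By contrast, a lookup that only checks direct children of the entry, such as root.findall("sense"), would find nothing in the second fragment.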

Please let us know if you have any other questions about these exciting changes!

Best wishes,

Cambridge Dictionaries Online team

English Profile Levels

If you're using the API with the dataset 'british', you might be interested to know we've improved the coverage and added a new feature, English Profile Levels, to the XML.

Senses may now have 'lvl' tags (usually at 'def-info' level), with information from the English Vocabulary Profile project. The project categorises words and their senses into levels according to when learners are likely to be familiar with them. While everyone's learning path is different, the level is a useful way of identifying how important a word is to learn, and it can also be used to gauge a learner's progress.
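
As a rough sketch of how the 'lvl' tags might be read, the following uses Python's standard xml.etree.ElementTree. The sample fragment is invented for illustration: lvl and def-info are the names mentioned above, while entry, sense and def, and the level values shown, are assumptions and not real dictionary data.

```python
import xml.etree.ElementTree as ET

# Invented sample: element names other than "lvl" and "def-info" are assumed,
# and the levels are placeholders, not data from the dictionary.
SAMPLE = """
<entry>
  <sense>
    <def-info><lvl>A1</lvl></def-info>
    <def>placeholder definition one</def>
  </sense>
  <sense>
    <def-info><lvl>B2</lvl></def-info>
    <def>placeholder definition two</def>
  </sense>
  <sense>
    <def>placeholder definition with no level</def>
  </sense>
</entry>
"""

def sense_levels(xml_text):
    """Return (level, definition) pairs; level is None when a sense has no lvl."""
    root = ET.fromstring(xml_text)
    results = []
    for sense in root.iter("sense"):
        # lvl is usually under def-info, but search at any depth to be safe.
        lvl = sense.find(".//lvl")
        definition = sense.find(".//def")
        level = lvl.text.strip() if lvl is not None and lvl.text else None
        text = "".join(definition.itertext()).strip() if definition is not None else ""
        results.append((level, text))
    return results

for level, text in sense_levels(SAMPLE):
    print(level, text)
```

Because senses only "may" carry a level, the sketch returns None instead of failing when the tag is absent.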

The six levels, A1-C2, have been designed to fit with the Common European Framework of Reference for Languages (CEFR), so they are comparable with levels of learning across different languages. They are based on investigations into the language students actually produce in examinations, using, among other things, the Cambridge Learner Corpus, and also a growing English Profile corpus of the language students are using in schools.
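
Because the six levels form an ordered scale, an application can compare them, for example to show only the senses a learner at a given level is likely to know. This is a small sketch with hypothetical function names; it assumes levels arrive as plain strings such as "A1" or "b2".

```python
# The six CEFR levels in ascending order.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def level_rank(level):
    """Position of a level on the scale (0 for A1, 5 for C2)."""
    return CEFR_LEVELS.index(level.strip().upper())

def within_level(sense_level, learner_level):
    """True if the sense is at or below the learner's level."""
    return level_rank(sense_level) <= level_rank(learner_level)

senses = [("A1", "sense one"), ("B2", "sense two"), ("C1", "sense three")]
known = [text for level, text in senses if within_level(level, "B2")]
print(known)
```

An unrecognised level raises ValueError here; a real application might prefer to skip such senses instead.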

For example, from the entry for "paint", you can see that the sense of making a picture is very well understood by learners, but the sense of covering a room in paint may not be used by learners until a little later.

How will you use this new information?

Python and Ruby Wrappers

You can now access Cambridge Dictionaries Online using Python and Ruby dictionary API wrappers, so you can bring Cambridge's English language dictionaries into your Django and Rails sites.

We've also brought all our wrappers into one API Developer Resources page on this site, along with samples. You can also find other things there like the DTD for the dictionary XML, and sample CSS for styling the dictionary entries.

You'll notice our wrappers are free and open-source, provided under the terms of a 3-clause BSD license, i.e. you can use, modify and share them, provided you retain the copyright notice and license text.

What's a wrapper? What's a binding?

A wrapper, sometimes known as a binding, is a piece of code written in a particular language to allow other code in that language to access some other service more easily. So, for example, instead of you writing code to fetch data from the URLs directly, the wrapper handles all the fiddly escaping and provides the methods in a way which is appropriate for that language. You just need to create the object, pointing it to CDO and giving your access key; then you can call the methods of that object (e.g. with arguments such as "british", "slate").
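
To illustrate the idea, here is a minimal sketch of what a wrapper does internally. It is not the published Python wrapper: the class name, method names, URL layout, base URL and the accessKey header name are all assumptions made for this example, and the network call is replaced by a callable you pass in, so the sketch runs without any access key.

```python
from urllib.parse import quote, urlencode

class DictionaryClient:
    """Hypothetical wrapper sketch; names and URL layout are assumed."""

    def __init__(self, base_url, access_key, fetch):
        self.base_url = base_url.rstrip("/")
        self.access_key = access_key
        self.fetch = fetch  # callable(url, headers) -> response body

    def _url(self, *segments, **params):
        # Escape each path segment and the query string so callers need not.
        path = "/".join(quote(segment, safe="") for segment in segments)
        query = "?" + urlencode(params) if params else ""
        return f"{self.base_url}/{path}{query}"

    def _headers(self):
        # Header name is an assumption for this sketch.
        return {"accessKey": self.access_key}

    def get_entry(self, dictionary, entry_id):
        url = self._url("dictionaries", dictionary, "entries", entry_id)
        return self.fetch(url, self._headers())

    def search(self, dictionary, query):
        url = self._url("dictionaries", dictionary, "search", q=query)
        return self.fetch(url, self._headers())

# Usage with a stand-in fetch function that records the requests it receives.
calls = []

def fake_fetch(url, headers):
    calls.append((url, headers))
    return "<entry/>"

client = DictionaryClient("https://api.example.invalid/v1", "MY-KEY", fake_fetch)
client.get_entry("british", "slate")
client.search("british", "ice cream & cake")
for url, _ in calls:
    print(url)
```

Passing the fetch function in keeps the URL-building and escaping logic testable on its own; a real wrapper would make the HTTP request itself.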

What's a DTD?

The Document Type Definition or DTD is a simple machine-readable description of the elements and attributes that are allowed or required in an XML document. We have also written human-friendly comments throughout the DTD explaining the roles of the elements. We aim to keep all our entries in line with this DTD.

What else have we been doing?

If you haven't noticed already, we've been fixing the documentation and have made the API specification cleaner and much easier to read.

We've also fixed a small bug in the HTML output - you should have got an email about this earlier in the month if you're registered to get an API key.

Let us know in the comments box below if you've got any problems with the wrappers or the documentation.
