Feature request: Merging phrasal verbs

One feature I would love to see implemented on Readlang is the ability to merge words that are separated by other words for context-aware translation as well as regular translation. Consider the following example from a German text:

“Hast du nicht? Dann hol das unbedingt nach.”

In this example, hol and nach actually represent a single semantic unit - the verb nachholen - which means “to catch up/to make up.”

There is one crucial difference between German and English phrasal verbs: English phrasal verbs often stay together (“Then be sure to catch up”), while German phrasal verbs frequently appear in separate parts of the sentence (“Dann hol das unbedingt nach”).

Because of this, translating these German verbs is jarring in virtually all of the apps I have tried. One website that attempts to fix this problem is lemmatize.com, which tries to detect and select semantic units automatically, but I’ve found that it doesn’t work as expected most of the time.

Fortunately, learners of languages like German don’t need this automatic detection at all. We can identify these semantic units ourselves. What we need is an easy way to tell a translation app/LLM what we want translated/explained. And Readlang already seems to have the necessary infrastructure for it.

There already exist two options for merging: “Don’t Merge” and “Merge Phrases”. I suggest adding a third one, something like “Always Merge”, which will allow the user to merge selected words even if they are apart. For instance, in the example sentence I provided above, if I click on hol and then on nach, the app will translate hol nach if the “Always Merge” option is selected, instead of first translating hol and then nach separately, as it does with the “Merge Phrases” option currently.
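A rough sketch of how the three modes could differ, in Python. Everything here is hypothetical: the mode names, the function names, and the assumption that “Merge Phrases” only joins adjacent clicked words are illustrations of the proposal, not Readlang's actual implementation.

```python
def group_selection(selected, mode):
    """Group clicked word positions into the units sent for translation.

    `selected` holds the positions of the clicked words in the sentence.
    `mode` is one of the hypothetical names "dont_merge", "merge_phrases"
    or "always_merge".
    """
    ordered = sorted(set(selected))
    if mode == "dont_merge":
        # Every clicked word is translated on its own.
        return [[i] for i in ordered]
    if mode == "always_merge":
        # Proposed behaviour: all clicked words form one unit, even if apart.
        return [ordered] if ordered else []
    if mode == "merge_phrases":
        # Assumed current behaviour: only neighbouring words are joined.
        groups = []
        for i in ordered:
            if groups and i == groups[-1][-1] + 1:
                groups[-1].append(i)
            else:
                groups.append([i])
        return groups
    raise ValueError(f"unknown mode: {mode}")


def to_phrases(words, groups):
    """Turn groups of positions back into the text of each unit."""
    return [" ".join(words[i] for i in group) for group in groups]


words = ["Dann", "hol", "das", "unbedingt", "nach"]
clicked = [1, 4]  # the user clicks "hol", then "nach"

print(to_phrases(words, group_selection(clicked, "merge_phrases")))  # ['hol', 'nach']
print(to_phrases(words, group_selection(clicked, "always_merge")))   # ['hol nach']
```

With “Always Merge”, the translator or LLM receives hol nach as one unit together with the sentence, which is what it needs to recognise nachholen.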


Thanks for the suggestion. Handling of phrasal verbs is a long-standing issue, but it seems to be an issue only for German learners, so it has never felt worth addressing, since German learners are a small fraction of the total. That said, it’s come up often enough that I’m tempted to think more about it.


I promise you if you implement this feature you will win over all German learners!


It also comes up in Slavic languages with reflexive verbs; some are compositional in meaning (Czech “umýt se” to wash oneself), many are not (Czech “radovat se” to be joyful).

Moreover, English phrasal verbs also consist of two elements that may be separated by other words in the sentence (“could you pick them up?”).

People also occasionally insert stuff into idioms one might want to learn (“…still grateful that Sport England footed a £165m bill for the 2002 Commonwealth Games”; foot a/the bill (for sth) – be responsible for paying the cost of sth). I suppose this kind of word play happens in all languages.


Your observations regarding reflexive verbs apply to German too. For example:

Ich habe damals versprochen (I promised back then).
Ich habe mich damals versprochen (I misspoke back then).


This would also be a great feature for separable verbs in Dutch.


I’m upvoting this. My wife and I would benefit greatly from it as German learners.


This feature really is needed for Dutch as well. Separable verbs are an essential part of Dutch too, since it shares much of its vocabulary and grammar with German. I’ve been using Readlang to improve my Dutch vocabulary for a long time, but it’s really painful when it comes to separable verbs.


It’s also an issue for Swedish learners (and I suppose Norwegian and Danish as well), although I understand we’re a minority. What I do right now is translate the whole phrase, and then I either don’t bother to save those translations or I save only the “main” verb or word and count on my brain to supply the missing part whenever I encounter it. Not ideal, but bearable.

This challenge is readily apparent in German. I see AI as a promising tool for analyzing enough of the context to put two and two together and find the connection. “abholen” is a great example in German: “Ich hole Dich ab.”

The downside of AI is that it infers a translation from context even when that’s not helpful, such as when you want to isolate the word. But the Readlang reader’s integration with a dictionary allows the experienced language learner to compensate for that.


I think this would benefit way more than just German learners when you consider other things than verbs. For example, in French I keep running across words that are spaced apart in a sentence but form one semantic unit, like “autant… que…”


Of course, in Dutch too, not just German. And in other languages as well, if they use expressions whose words may get scattered across the sentence.

But the thing is - READLANG presents itself as a “helping and understanding aid for readers”, whereas “taking expressions together” and reducing them to “their base form” is more about “teaching a language”, which is a step further and maybe not the ultimate aim of READLANG.

However, if you select a phrase of several words or even a complete sentence, it should not be too difficult to ask AI via the “explain” button to produce a vocab list with nouns, verbs, adjectives, adverbs, prepositions, other … and expressions, and all in their base forms.

So the student can learn from the “explain” information. There would be no need for the student to find the words of an expression or a split verb in the sentence, whether consecutive or scattered, and have these explained.

From the vocab list (base forms) and expressions in the “explain” session, it would then be handy to click on these again to create a flashcard. And an “export” feature, as well as an “import” one, makes any app more popular, since people then feel free to use whichever tools and apps suit their needs, not just READLANG.

But the non-base forms, the inflected forms as they appear in the sentence itself, are equally important, as they illustrate a use case for the base words. If a verb is separable, the sentence illustrates how and where to split it.


The question is whether generative AI in its current state of development would be able to do that. My experience is that there is “no there there” in terms of grammatical understanding. If I ask ChatGPT to make a list of frequently used separable verbs in Dutch, it comes up with random lists of verbs, some of which are separable and others not. Ask it to make sentences with them and it makes grammatical errors.

It would seem to me that asking these bullshit text generators to meaningfully and correctly analyze a sentence would be a bridge too far.

I think this is the way! I want to see how separable verbs are used in writing, and have the infinitive verb form on the flash card as a reference. Best of both worlds then.


I didn’t want to start a discussion about AI here in Steve’s suggestions forum.
So I posted my Deepseek R.3 test on “separable Dutch verbs” here:

Feel free to come and visit us. Why not register as well, to discuss further and share your views on AI and your experience with these tools?

NOTE: I think it’s up to Steve to decide how far he wants to go in “teaching” rather than basic/advanced “helping people to read and understand things” in the form they appear.

Anything more, like base forms, unconjugated verbs, expressions, or taking words together, could be seen as teaching, not “bare helping”.

That said, if the cards we make are for us to “learn” from, then we should be able to “CTRL-click” several non-consecutive words so that they appear as the “phrase to learn” on the card, perhaps with […] marking the skipped parts.
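A minimal sketch of how such a card phrase could be assembled, in Python. The function name and the idea of identifying clicked words by their position in the sentence are assumptions made for illustration, not anything Readlang provides.

```python
GAP = "[…]"  # marker for the words skipped between two selected words


def card_phrase(words, selected):
    """Build the “phrase to learn” from non-consecutive selected words.

    `words` is the sentence split into words, `selected` the positions
    the user CTRL-clicked (in any order).
    """
    parts = []
    previous = None
    for i in sorted(set(selected)):
        if previous is not None and i > previous + 1:
            # At least one word was skipped since the last selected word.
            parts.append(GAP)
        parts.append(words[i])
        previous = i
    return " ".join(parts)


words = ["Dann", "hol", "das", "unbedingt", "nach"]
print(card_phrase(words, [1, 4]))  # hol […] nach
print(card_phrase(words, [1, 2]))  # hol das
```

The card keeps the inflected words in sentence order, so it still shows how and where the verb is split.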
