Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue: Allow to share contacts as vCards #5422

Open
Hocuri opened this issue Apr 3, 2024 · 6 comments
Open

Tracking issue: Allow to share contacts as vCards #5422

Hocuri opened this issue Apr 3, 2024 · 6 comments

Comments

@Hocuri
Copy link
Collaborator

Hocuri commented Apr 3, 2024

Tracking issue for sending a contact via Delta Chat, you probably know it from other messengers:

Motivation

Problem:

  • Alice has chats with Bob and Carol. Bob is using a Chatmail instance
  • She wants to introduce Bob and Carol, so she sends Carol's email address to Bob
  • Bob tries to send a message to Carol, but he can't because Chatmail doesn't allow unencrypted outgoing messages 😢

Solution:

  • Allow to share contacts as vCards, so that Alice can share Carol's contact to Bob. The vCard will then also contain Bob's public key. Carol will then use it as Bob's gossip key.

API design and implementation steps

From an a/v with @hpk42:

Rationale:

  • We thought about letting parse_vcard() directly create a hidden contact and then giving the contact id to the UI. Then opening a chat in which we received a vCard would have the sideeffect of creating a contact in the db, and if the user then later writes to this contact then the information we received would be reused - not a huge problem, but also not 100% nice.
  • For now, DC will only have support to show vCard's with a single contact. parse_vcard() will still return a list so that UIs later have the possibility to show multiple vCard's without another API change.

Questions and design choices

For each point, I specified which solution I prefer; if noone objects, we'll go with my preferred solutions

  1. What does core do do if it already has a gossip key the shared contact?
    • Probably just ignore the key from the vCard, because it may be a very old vCard
    • Alternatively, it could compare the timestamps
  2. How to make it possible to save the public key after the user clicked on a received vCard?
    • (Preferred) Solution 1: we need another API import_vcard(), which would then be the same as vCard-based address book import API #5202
      • What's not an argument against this point: "The UI may want to show an edit-contact dialog before actually saving it, giving the user to change the display name." But the UI can just first call import_vcard() and then show a dialog that edits the contact (in fact, it can just open the "edit contact" activity we already have)
    • Solution 2: parse_vcard() could directly save the peerstate, which would let parse_vcard() have a side effect.
      • Then, if I receive a vcard of Bob (without clicking on it), Bob changes his public key, someone else sends me another vcard of Bob with a new key and I click on it, my DC Core wrongfully uses the first shared key (because, as explained in the previous Open Question, core probably just ignoren the shared key from a vCard if it already has one). Although this seems to be only a veeeery niche problem
      • It violates the principle of least astonishment a little bit if parse_vcard() has side effects
      • On the upside, this solution helps keep the API surface smaller
  3. How should the contact be passed to create_vcard() - a contact id? A json struct with all the fields?
    • We definitely need to pass the contact id, because the UI doesn't have the public key.
    • Pro json struct: This would allow the UI to later show a dialog with options which contact details to share (phone number?)
    • Alternatively, just ignore the issue for now.
  4. DC will only ever have support to send a single contact in a vCard message, so @hpk42 and me figured that it's enough if create_vcard() takes a single contact. However, looking at vCard-based address book export API #5203, it may still be nice to take a list of contact ids so that we don't need another API for that.
    • Pro single contact: It prevents overeager UIs to send multiple contacts in one message (which other UIs can't view for the time being)
    • Pro list: Makes create_vcard() and parse_vcard() more symmetric
    • I personally prefer the list
  5. What does create_vcard() return? A filename, a vCard as text, set as draft?
    • Solution 1: create a file (and returns the filename) which the UI can then attach to the message
    • Solution 2: A vCard as text, the UI has to save it into a file and attach it
    • Solution 3: Set it as a draft. Seems overly inflexible - maybe the UI wants to directly send it, instead of setting it as a draft first. On the upside, it's the least work for the UIs.
    • I very slightly tend to prefer solution 1, but all solutions seem viable.
  6. What does parse_vcard() take? A filename, a vCard as text, a message id?
    • Probably just use the same decision as for create_vcard()
@Simon-Laux
Copy link
Member

Simon-Laux commented Apr 4, 2024

  1. ignoring sounds good, though maybe makes sense to attach some timestamp to those keys if the edge cases are important enough
  2. I prefer solution 1 - no side effects.
  3. contact id, as we don't have phone number in core, we could allow adding arbitrary keys by the ui, something like
fn create_vcard(contact_id: ContactId, additional_keys: Option<Hashmap<String, String>>)
  1. I agree seems to make sense, also because multiple contacts would still output one vcard file. (though then my api suggestion of 3. would need to be adjusted to Option<Hashmap<ContactId, Hashmap<String, String>>>)

@link2xt
Copy link
Collaborator

link2xt commented Apr 5, 2024

What does create_vcard() return? If it is a vCard as text, how do we attach it? Or should it set a draft?

@Hocuri
Copy link
Collaborator Author

Hocuri commented Apr 6, 2024

@Simon-Laux
1​. There often already is a timestamp in vCards, we could use this one.
3​. and 4. We currently don't have phone numbers, the idea was because we might add them at one point in the future. Though the API would probably rather take a list of which fields should be included.

@link2xt Probably it makes sense if create_vcard() creates a file (and returns the filename) which the UI can then attach to the message. I added it as open question 5.

@Hocuri
Copy link
Collaborator Author

Hocuri commented Apr 6, 2024

Should we use a crate to parse the vCards, or parse and create vCards manually?

1. Use ical

This is what I tried first. It looked very promising, so I actually started using it, but in the end, I do not think it can give us what we want.

ical is quite small, has almost no dependencies, has over 7000 downloads per month, and I reviewed it a bit before trying it out - the code looks clean, there is no unsafe, and I couldn't find anything shady.

For each contact, ical gives you a list of Property consisting of {name, params, value}. It doesn't parse the values, e.g. it doesn't turn the timestamp (20080424T195243Z) into a u64 or so.

Errors when parsing

ical seems to be picky when parsing.. It succeeded to parse a vCard exported from Thunderbird, the example from Wikipedia, and https://github.com/nuovo/vCard-parser/blob/master/Example.vcf, but it just returned an error when trying to parse https://github.com/nuovo/vCard-parser/blob/master/Example3.0.vcf or https://github.com/nuovo/vCard-parser/blob/master/Example2.1.vcf.

It's not (easily) possible to circumvent the errors when trying to parse https://github.com/nuovo/vCard-parser/blob/master/Example3.0.vcf or https://github.com/nuovo/vCard-parser/blob/master/Example2.1.vcf. Maybe that's fine, maybe these vCards are just violating the RFC.

Mistakes when parsing the KEY attribute

Also, it only succeeded to parse the 3.0 example for the KEY attribute at Wikipedia. For the 2.1 example (KEY;PGP;ENCODING=BASE64:[base64-data]), it returned a parameter with PGP;ENCODING as the key and BASE64 as the value. And for the 4.0 example (KEY:data:application/pgp-keys;base64,[base64-data] - the same as in the RFC), it returned data:application/pgp-keys;base64,[base64-data] as the key (just [base64-data] would be correct).

It's possible to handle KEY formats that are not supported by ical, but it's some effort, even slightly more than if ical just gave us the raw line with the KEY and we then parsed it manually.

It's also possible to just say that Delta Chat will only be able to parse KEYs in the 3.0 format.

Final remarks

ical has code to handle dquoted values, i.e. NAME:Foo="Some quoted text: that contains a colon":value. I didn't see any other code that handles special cases, but also, I didn't search a long time.

All in all, it looks like using ical would be possible if we accept the occasional parsing error, but not really provide value over parsing and creating vCards manually.

This is the code I used to try out ical

use anyhow::Result;
use once_cell::sync::Lazy;
use regex::Regex;

// TODO: Check if sanitizing is done correctly everywhere

#[derive(Debug, Default)]
pub struct Contact {
    addr: String,
    display_name: String,
    key: String,
    profile_photo: String,
    timestamp: u64,
}

pub fn parse_vcard(vcard: String) -> Result<Vec<Contact>> {
    let reader = ical::VcardParser::new(vcard.as_bytes());

    let mut contacts = Vec::new();
    for vcard_contact in reader {
        let vcard_contact = vcard_contact?; // TODO should just continue with the next contact
        let mut new_contact = Contact::default();

        let mut display_name = None;
        let mut addr = None;
        let mut key = None;
        let mut photo = None;
        let mut timestamp = None;

        for property in vcard_contact.properties {
            match &*property.name.to_lowercase() {
                "email" => addr = addr.or(property.value),
                "fn" => display_name = display_name.or(property.value),
                "key" => key = key.or(dbg!(property).value), // TODO hmmm, the ical crate can apparently only parse version 3.0
                "photo" => photo = photo.or(property.value),
                "rev" => {
                    timestamp = timestamp.or(property.value);
                }
                _ => {}
            }
        }

        let (display_name, addr) =
            sanitize_name_and_addr(&display_name.unwrap_or_default(), &addr.unwrap_or_default());
        new_contact.display_name = display_name;
        new_contact.addr = addr;

        if let Some(key) = key {
            if let Some(base64_key) = key
                .strip_prefix("PGP;ENCODING=BASE64:")
                .or(key.strip_prefix("TYPE=PGP;ENCODING=b:"))
                .or(key.strip_prefix("data:application/pgp-keys;base64,"))
            {
                new_contact.key = base64_key.to_string();
            }
        }

        if let Some(photo) = photo {
            if let Some(base64_photo) = photo
                .strip_prefix("PGP;ENCODING=BASE64:")
                .or(photo.strip_prefix("TYPE=PGP;ENCODING=b:"))
                .or(photo.strip_prefix("data:application/pgp-keys;base64,"))
            {}
        }

        contacts.push(new_contact);
    }

    Ok(contacts)
}

2. Use vcard

vcard has more lines of code and way more dependencies than ical. This is because it does more work, e.g. you can give it a time and it will correctly format the timestamp into the vCard.

It only has 130 downloads per month, likely because it's not finished; right now it can only create vCards, not parse them.

Since parsing is the harder task than creating vcards for us, vcard is not helpful for us in any way.

3. Use a different library

Crates.io has 94 results for vcard, lib.rs has 16 crates with the vcard tag. All of them have very few or almost no downloads, which makes it likely that they contain bugs that were just not found because no one actually used these libraries. Still, maybe there is some very good library in there that I just didn't look at.

3. Parse and create vCards manually

If we sacrifice some compatibility, like:

  • demanding that an PGP key and avatar is in a very specific format (and otherwise just ignoring the key/avatar)
  • Not supporting quotes in keys, like NAME:Foo="Some quoted text: that contains a colon":value (which I didn't see anywhere "in the wild")

then it is actually a really easy task, and even having more robust parsing would probably still be easier than trying to work around icals problems I described above.

The simple code to parse a vcard manually (which is a bit longer than the code I used for `ical` above, but also, it's doing a bit more work)

#[derive(Debug)]
/// A Contact, as represented in a VCard.
pub struct VcardContact {
    /// The email address, vcard property `email`
    pub addr: String,
    /// The contact's display name, vcard property `fn`
    pub display_name: String,
    /// The contact's public PGP key, vcard property `key`
    pub key: Option<String>,
    /// The contact's profile photo (=avatar), vcard property `photo`
    pub profile_photo: Option<String>,
    /// The timestamp when the vcard was created / last updated, vcard property `rev`
    pub timestamp: Result<u64>,
}

pub fn parse_vcard(vcard: String) -> Result<Vec<VcardContact>> {
    fn remove_prefix<'a>(s: &'a str, prefix: &str) -> Option<&'a str> {
        let start_of_s = s.get(..prefix.len())?;

        if start_of_s.eq_ignore_ascii_case(prefix) {
            s.get(prefix.len()..)
        } else {
            None
        }
    }
    fn vcard_property<'a>(s: &'a str, property: &str) -> Option<&'a str> {
        let remainder = remove_prefix(s, property)?;

        // TODO this doesn't handle the case where there are quotes around a colon
        let (_params, value) = remainder.split_once(':')?;
        Some(value)
    }
    fn parse_datetime(datetime: Option<&str>) -> Result<u64> {
        let datetime = datetime.context("No timestamp in vcard")?;
        let datetime = DateTime::parse_from_rfc3339(datetime)?;
        let timestamp = datetime.timestamp().try_into()?;
        Ok(timestamp)
    }

    let mut lines = vcard.lines().peekable();
    let mut contacts = Vec::new();

    while lines.peek().is_some() {
        // Skip to the start of the vcard:
        for line in lines.by_ref() {
            if line.eq_ignore_ascii_case("BEGIN:VCARD") {
                break;
            }
        }

        let mut display_name = "";
        let mut addr = "";
        let mut key = None;
        let mut photo = None;
        let mut datetime = None;

        for line in lines.by_ref() {
            if let Some(email) = vcard_property(line, "email") {
                addr = email;
            } else if let Some(name) = vcard_property(line, "fn") {
                display_name = name;
            } else if let Some(k) = remove_prefix(line, "KEY;PGP;ENCODING=BASE64:")
                .or(remove_prefix(line, "KEY;TYPE=PGP;ENCODING=b:"))
                .or(remove_prefix(line, "KEY:data:application/pgp-keys;base64,"))
            {
                key = Some(key.unwrap_or(k));
            } else if let Some(p) = remove_prefix(line, "PHOTO;JPEG;ENCODING=BASE64:")
                .or(remove_prefix(line, "PHOTO;TYPE=JPEG;ENCODING=b:"))
                .or(remove_prefix(line, "PHOTO;ENCODING=BASE64;TYPE=JPEG:"))
            {
                photo = Some(photo.unwrap_or(p));
            } else if let Some(rev) = vcard_property(line, "rev") {
                datetime = Some(datetime.unwrap_or(rev));
            } else if line.eq_ignore_ascii_case("END:VCARD") {
                break;
            }
        }

        let (display_name, addr) = sanitize_name_and_addr(display_name, addr);

        contacts.push(VcardContact {
            display_name,
            addr,
            key: key.map(|s| s.to_string()),
            profile_photo: photo.map(|s| s.to_string()),
            timestamp: parse_datetime(datetime),
        });
    }

    Ok(contacts)
}

In summary, I think not using an external crate and doing all the parsing ourselves is the best option.

Hocuri added a commit that referenced this issue Apr 8, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add https:////github.com//issues/5422 into
the new crate.
Hocuri added a commit that referenced this issue Apr 8, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add https:////github.com//issues/5422 into
the new crate.
Hocuri added a commit that referenced this issue Apr 8, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add #5422 into
the new crate.
Hocuri added a commit that referenced this issue Apr 8, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add #5422 into
the new crate.
Hocuri added a commit that referenced this issue Apr 8, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add #5422 into
the new crate.
Hocuri added a commit that referenced this issue Apr 16, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add #5422 into
the new crate.
Hocuri added a commit that referenced this issue Apr 16, 2024
I would like to implement
#5422 in its own
crate, but it will depend on some functions that are in the `deltachat`
crate.

So, this PR extracts these functions into its own crate so that I can
add #5422 into
the new crate.
@link2xt
Copy link
Collaborator

link2xt commented Apr 20, 2024

Problem with rolling out our own parsing that can only parse a small subset of vcards created by Delta Chat itself is that we cannot use it to import .vcf generated by android contacts application for #5202

If ical has problems with PGP keys etc. maybe it is possible to contribute fixes upstream?

@Hocuri
Copy link
Collaborator Author

Hocuri commented Apr 24, 2024

Problem with rolling out our own parsing that can only parse a small subset of vcards created by Delta Chat itself is that we cannot use it to import .vcf generated by android contacts application for #5202

I just tried it out, and it's actually the other way round: I exported a .vcf file generated by my android contacts application and tried to parse it with both my hand-written "parser" (not sure if you can call it a parser, it's really simple, just searching through the file for the lines we need) and my code using ical.

My own "parser" successfully extracted the name and email address, while ical failed to do so.

Details

Code to test:

    #[test]
    fn test_android_contact_export() {
        let contacts = parse_vcard(
            "BEGIN:VCARD
VERSION:2.1
N:;Bob;;;
FN:Bob
TEL;CELL:+1-234-567-890
EMAIL;HOME:bob@example.org
END:VCARD
BEGIN:VCARD
VERSION:2.1
N:;Alice;;;
FN:Alice
EMAIL;HOME:alice@example.org
END:VCARD
"
            .to_string(),
        )
        .unwrap();

        assert_eq!(contacts[0].addr, "bob@example.org".to_string());
        assert_eq!(contacts[0].display_name, "Bob".to_string());
        assert_eq!(contacts[0].key, "".to_string());
        assert_eq!(contacts[0].profile_photo, "".to_string());

        assert_eq!(contacts[1].addr, "alice@example.org".to_string());
        assert_eq!(contacts[1].display_name, "Alice".to_string());
        assert_eq!(contacts[1].key, "".to_string());
        assert_eq!(contacts[1].profile_photo, "".to_string());

        assert_eq!(contacts.len(), 2);
    }

With the code from #5482, it succeeds, with the code using ical it gives:

thread 'tests::test_android_contact_export' panicked at deltachat-contact-tools/src/lib.rs:439:10:
called `Result::unwrap()` on an `Err` value: property error: Line 5: Missing a "=" delimiter.

(it doesn't like the TEL;CELL:+1-234-567-890 because it expects sth like TEL;type=CELL:+1-234-567-890).

I pushed it to hoc/try-out-ical-2 in case you want to try it out.

If ical has problems with PGP keys etc. maybe it is possible to contribute fixes upstream?

This seems like a huge task, because there were multiple problems all of which we would need to fix, and if we want to contribute upstream then we don't only have to grep for the lines that are interesting for us, but actually parse it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants