action #743

closed

find a tool to convert mediawiki to pdf

Added by lnussel over 10 years ago. Updated over 10 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Development
Target version:
Start date: 2013-09-02
Due date: 2013-10-25
% Done: 0%
Estimated time:
Description

Jos needs an easy way to get from the features page in the opensuse wiki to a nice pdf document for the press.
It should be possible to retrieve the raw mediawiki source and convert to a markup language that is understood by office programs.

https://en.opensuse.org/index.php?title=openSUSE:Major_features&action=raw
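
A rough sketch of the fetch-and-convert idea described above, assuming pandoc is installed and that its `mediawiki` reader copes with the page. The function names, the local file names, and the choice of `.odt` as the office-friendly target are illustrative assumptions, not something decided in this ticket.

```python
import subprocess
import urllib.parse
import urllib.request

# Wiki entry point taken from the URL in the description.
BASE = "https://en.opensuse.org/index.php"


def raw_url(title, base=BASE):
    """Build the action=raw URL for a wiki page title."""
    query = urllib.parse.urlencode({"title": title, "action": "raw"}, safe=":")
    return f"{base}?{query}"


def pandoc_command(src, dst):
    """Argument list asking pandoc to read MediaWiki markup from src.

    pandoc picks the output format from the extension of dst (e.g. .odt).
    """
    return ["pandoc", "--from", "mediawiki", "--output", dst, src]


def convert(title, src="features.wiki", dst="features.odt"):
    """Fetch the raw page, save it, and run pandoc (needs network and pandoc)."""
    with urllib.request.urlopen(raw_url(title)) as response:
        text = response.read().decode("utf-8")
    with open(src, "w", encoding="utf-8") as handle:
        handle.write(text)
    subprocess.run(pandoc_command(src, dst), check=True)
```

Calling `convert("openSUSE:Major_features")` would write `features.odt` next to the script. Images referenced by the page are not downloaded by this sketch.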


Files

test.html (91.1 KB): asciidoc output (_miska_, 2013-09-03 06:57)
test.odt (37.7 KB): Libre Office output (_miska_, 2013-09-03 06:57)
Feature Guide.pdf (501 KB): That's what it should be... (Anonymous, 2013-09-06 13:46)
copy-paste.odt (1.13 MB): plain copy-paste into LO. Beats the script output... (Anonymous, 2013-09-06 13:46)
index.pdf (210 KB): deplate generated file (after manual tweaking) (alarrosa, 2013-10-22 16:33)

Updated by _miska_ over 10 years ago

Results of my first tests with pandoc:

  • pictures are screwed up
  • I prefer the asciidoc output over the LibreOffice one
      • with a little bit of sed, we should be able to fix that one...

Actions #2

Updated by toscalix over 10 years ago

  • Status changed from New to In Progress

Michal, when you finish this task, assign it to Jos so he tries it.

Actions #3

Updated by _miska_ over 10 years ago

  • Status changed from In Progress to Feedback
  • Assignee changed from _miska_ to Anonymous

Does either of them look at least a little bit helpful? Or should I continue trying? AFAIK, the pictures will have to be redone manually anyway, but it looks like the LO output uses styles :-/

Updated by Anonymous over 10 years ago

If I have to do the pictures by hand, I gain little over copy-pasting it all by hand into LibreOffice... At least then I get the images, even though I still have to re-arrange many of them and clean up lots of the text/formatting. See the plain copy-paste in attached odt, imho better than what you created (but still a lot of work to get to attachment two, the final pdf).

If this takes more than another hour, it's not worth it: within 1-2 hours I can turn the wiki into this document, so I'll do it by hand. Unless you think the script can handle images too, I think we should just close this.

Actions #5

Updated by _miska_ over 10 years ago

  • Status changed from Feedback to Rejected

Image formatting/alignment is difficult to handle automatically.
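
For whoever redoes the pictures by hand, a small sketch that at least lists which images the page references. It assumes the page uses the usual `[[File:...]]` / `[[Image:...]]` link syntax; the function name is made up for illustration, and it does nothing about placement or alignment.

```python
import re

# Matches the target of [[File:name.png|...]] and [[Image:name.png|...]] links.
IMAGE_LINK = re.compile(r"\[\[(?:File|Image):([^|\]]+)", re.IGNORECASE)


def image_names(wikitext):
    """Return referenced image names in order of first appearance, no duplicates."""
    seen = []
    for match in IMAGE_LINK.finditer(wikitext):
        name = match.group(1).strip()
        if name not in seen:
            seen.append(name)
    return seen
```

Feeding it the raw page text gives a checklist of files to fetch and place manually.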

Actions #6

Updated by lnussel over 10 years ago

  • Due date set to 2013-10-07
  • Status changed from Rejected to New
  • Target version changed from 13.1 Beta 1 to 13.1 RC1
Actions #7

Updated by _miska_ over 10 years ago

No update, haven't tried anything new. I still think that if we need it to look pretty with picture formatting, manual interaction will be needed.

Actions #8

Updated by lnussel over 10 years ago

  • Due date changed from 2013-10-07 to 2013-10-16
  • Target version changed from 13.1 RC1 to 13.1 RC2

Maybe those tools can do it. Worth a try IMO.

Actions #9

Updated by lnussel over 10 years ago

Note this is not just useful for this time's release, it might be useful for documentation in general. Some things make sense to have in the wiki as well as pdf.

Actions #10

Updated by lnussel over 10 years ago

  • Due date changed from 2013-10-16 to 2013-10-25
  • Assignee changed from _miska_ to alarrosa
Actions #11

Updated by alarrosa over 10 years ago

I installed Haskell and a few other packages to try building wb2pdf from source, but some of its dependencies are not available in SUSE. The author provides only Windows and Ubuntu binaries, and recommends that if you want to try wb2pdf on another distro, you should install VirtualBox with an Ubuntu guest ... so I don't think it's worth the time installing it just to try it.
I'll now try javaLatex and deplate ( http://deplate.sourceforge.net/ )

Actions #12

Updated by alarrosa over 10 years ago

The LaTeX output of deplate gave errors when compiled to a PDF, but after some manual tweaking, adding the following lines to the header,

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\DeclareUnicodeCharacter{00A0}{~}

I managed to make it generate a pdf. The awful attached pdf, to be more concrete.
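
The header tweak above could be scripted roughly like this. It is only a sketch: it assumes the generated file has a single-line `\documentclass` declaration, and the function name is hypothetical.

```python
# The three header lines quoted in the comment above.
HEADER_LINES = [
    r"\usepackage[utf8]{inputenc}",
    r"\usepackage[T1]{fontenc}",
    r"\DeclareUnicodeCharacter{00A0}{~}",
]


def patch_preamble(tex):
    """Insert HEADER_LINES right after the first \\documentclass line."""
    lines = tex.splitlines()
    for index, line in enumerate(lines):
        if line.lstrip().startswith(r"\documentclass"):
            # Slice assignment inserts the lines without replacing anything.
            lines[index + 1:index + 1] = HEADER_LINES
            break
    else:
        raise ValueError("no \\documentclass line found")
    return "\n".join(lines) + "\n"
```

Read the deplate output, pass it through `patch_preamble`, and write the result back before compiling.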

Actions #13

Updated by alarrosa over 10 years ago

I couldn't get javaLatex to generate a correct LaTeX document, even after spending some time trying to fix it manually, so I think I'll close this issue. If someone thinks it's worth spending more time to find or create a tool to convert MediaWiki format to PDF, just reopen it or create a new one.

Actions #14

Updated by toscalix over 10 years ago

  • Status changed from New to Rejected

Enough time spent already.

Actions #15

Updated by toscalix over 10 years ago

  • Status changed from Rejected to Closed