Timestamps are in UTC.
00:06:18 [Logomachist] | Logomachist has joined #swig |
---|---|
00:12:52 [timbl] | timbl has quit |
00:12:58 [apspoliveira_] | apspoliveira_ has quit |
00:17:05 [hg_] | hg_ has quit |
00:19:11 [karlcow] | karlcow has quit |
00:22:41 [cygri] | cygri has quit |
00:31:37 [Phae] | Phae has joined #swig |
00:33:15 [jsoltren] | jsoltren has quit |
00:34:17 [nathany] | nathany has quit |
00:36:06 [LeeF] | LeeF has joined #swig |
00:58:05 [Phae] | Phae has quit |
01:02:12 [shepazutoo] | shepazutoo has joined #swig |
01:07:26 [nathany] | nathany has joined #swig |
01:18:06 [chimezie] | chimezie has quit |
01:18:08 [LotR] | LotR has quit |
01:18:08 [gsnedders] | gsnedders has quit |
01:18:08 [CaptSolo] | CaptSolo has quit |
01:18:08 [simeoni] | simeoni has quit |
01:18:10 [shepazu] | shepazu has quit |
01:18:22 [sessi] | sessi has joined #swig |
01:18:22 [LotR] | LotR has joined #swig |
01:18:22 [gsnedders] | gsnedders has joined #swig |
01:18:22 [CaptSolo] | CaptSolo has joined #swig |
01:18:22 [simeoni] | simeoni has joined #swig |
01:18:23 [CaptSolo] | CaptSolo has quit |
01:18:27 [CS_] | CS_ has joined #swig |
01:18:46 [simeoni] | simeoni has quit |
01:19:26 [karlcow] | karlcow has joined #swig |
01:24:02 [jaresty] | jaresty has joined #swig |
01:26:46 [nathany] | nathany has quit |
01:31:55 [Arnia] | Arnia has joined #swig |
01:38:52 [shellac] | shellac has quit |
01:39:12 [harbulot] | harbulot has quit |
01:47:17 [melvster] | melvster has quit |
01:53:34 [melvster] | melvster has joined #swig |
01:53:55 [melvster] | melvster has left #swig |
02:03:05 [chimezie] | chimezie has joined #swig |
02:15:46 [Tristan] | Tristan has quit |
02:16:14 [Tristan] | Tristan has joined #swig |
02:16:20 [allisterb] | allisterb has joined #swig |
02:18:51 [MacTed] | MacTed has joined #swig |
02:26:05 [coreyleong] | coreyleong has joined #swig |
02:43:43 [justben] | justben has joined #swig |
03:03:25 [chimezie] | chimezie has quit |
03:13:07 [eikeon] | eikeon has quit |
03:14:34 [chimezie] | chimezie has joined #swig |
03:17:24 [eikeon] | eikeon has joined #swig |
03:19:36 [Tristan] | Tristan has quit |
03:21:56 [Tristan] | Tristan has joined #swig |
03:22:50 [Tristan] | Tristan has quit |
03:23:17 [Tristan] | Tristan has joined #swig |
03:24:11 [Tristan] | Tristan has quit |
03:24:32 [Tristan] | Tristan has joined #swig |
03:25:26 [Tristan] | Tristan has quit |
03:26:45 [Tristan] | Tristan has joined #swig |
03:29:30 [coreyleong] | coreyleong has quit |
03:33:13 [Tristan] | Tristan has quit |
03:33:29 [karlcow] | karlcow has quit |
03:33:33 [Tristan] | Tristan has joined #swig |
03:40:28 [drewp] | drewp has quit |
03:41:51 [drewp] | drewp has joined #swig |
03:53:17 [nathany] | nathany has joined #swig |
04:02:57 [justben] | justben has quit |
04:04:15 [logger] | logger has quit |
04:06:12 [karlcow] | karlcow has joined #swig |
04:08:38 [dulanov] | dulanov has joined #swig |
04:09:08 [activefx] | activefx has joined #swig |
04:29:14 [chimezie_] | chimezie_ has joined #swig |
04:30:40 [chimezie] | chimezie has quit |
04:32:16 [chimezie_] | chimezie_ is now known as chimezie |
04:37:13 [mun] | mun has quit |
04:38:12 [mun] | mun has joined #swig |
04:39:15 [dmitry_ulanov] | dmitry_ulanov has joined #swig |
04:44:27 [activefx] | activefx has quit |
04:47:55 [dulanov] | dulanov has quit |
04:53:07 [MacTed] | MacTed has quit |
04:54:26 [jansc] | jansc has joined #swig |
04:57:48 [jansc] | jansc has quit |
05:15:32 [cbichis] | cbichis has joined #swig |
05:16:09 [cbichis] | cbichis has left #swig |
05:30:49 [twanj] | twanj has quit |
05:31:41 [oshani] | oshani has quit |
05:57:59 [mhausenblas] | mhausenblas has joined #swig |
06:00:59 [dmitry_ulanov] | dmitry_ulanov has quit |
06:01:28 [mhausenblas] | good morning Web of Data |
06:05:20 [kwijibo] | kwijibo has joined #swig |
06:06:26 [mhausenblas] | oh oh - kwijibo |
06:06:39 [kwijibo] | hi |
06:06:53 [mhausenblas] | can't believe it, man what are you doin up that early? |
06:06:56 [kwijibo] | early for u mhausenblas ? |
06:07:02 [mhausenblas] | :) |
06:07:20 [kwijibo] | i'm still in austria, so it's only 7 for me |
06:07:46 [kwijibo] | don't you ever sleep man? |
06:07:47 [mhausenblas] | ah |
06:08:05 [mhausenblas] | yep. from around 11:00pm to 6:00am |
06:08:18 [mhausenblas] | which is more than sufficient |
06:08:36 [mhausenblas] | (having children one learns to appreciate every hour sleep ;) |
06:09:02 [mhausenblas] | logger, pointer |
06:09:17 [mhausenblas] | oh, too early for the bots? |
06:10:03 [keithA] | keithA has joined #swig |
06:10:26 [kwijibo] | kwijibo has quit |
06:10:35 [mhausenblas] | logger-sioc-rdfa, pointer |
06:10:35 [mhausenblas] | See http://buzzword.org.uk/2009/logger-sioc-rdfa/swig/2009-03-11.html#T06-10-35 |
06:11:11 [mhausenblas] | melvster, re http://esw.w3.org/topic/PushBackDataToLegacySourcesAuthentication - great work |
06:12:18 [keithA] | keithA has quit |
06:12:45 [mhausenblas] | What I don't get, though, is the distinction between the RDF source and the non-RDF source |
06:13:38 [mhausenblas] | Our setup is: we have a write-wrapper around a non-RDF source such as Twitter |
06:14:09 [mhausenblas] | precisely as uriburner can be understood as a read-wrapper around Twitter |
06:15:08 [mhausenblas] | now, the user navigating in the 'semantic space' (i.e., pulling, viewing, filtering *RDF* data from different RDF data sources) wants to edit |
06:15:54 [mhausenblas] | currently, the poor chap needs to copy & paste from Tabulator or whatever he uses, log into the origin non-RDF source and change things there |
06:16:02 [mhausenblas] | and pushback will shortcut this |
06:16:16 [mhausenblas] | where is the RDF source in this setup? :) |
06:44:02 [mhausenblas] | so, we could add some stuff about how we gonna handle this in pushback as well (sort of plug-in?) |
06:44:18 [mhausenblas] | the RDForm is configurable per se |
06:44:37 [mhausenblas] | we can specify a default auth mechanism, people may choose others |
06:45:09 [mhausenblas] | today we will have an RDForm container that helps in understanding the interaction part (Aftab is on this) |
06:45:58 [mhausenblas] | last things: I added a link to your auth page from the home page of pushback |
06:47:24 [mhausenblas] | phenny, tell melvster re pushback auth - thanks! and see http://buzzword.org.uk/2009/logger-sioc-rdfa/swig/2009-03-11.html#T06-10-35 for comments |
06:47:24 [phenny] | mhausenblas: I'll pass that on when melvster is around. |
06:55:29 [mhausenblas] | dajobe: chump bot not around? |
07:07:22 [cbichis] | cbichis has joined #swig |
07:09:32 [sYskk] | sYskk has quit |
07:11:52 [cbichis] | cbichis has left #swig |
07:14:53 [presbrey] | presbrey has quit |
07:20:45 [jansc] | jansc has joined #swig |
07:28:22 [kwijibo] | kwijibo has joined #swig |
07:33:53 [Wikier] | Wikier has joined #swig |
07:34:57 [pauld] | pauld has joined #swig |
07:37:18 [bengee] | bengee has joined #swig |
07:39:17 [pesla] | pesla has joined #swig |
07:44:27 [kwijibo] | kwijibo has quit |
08:02:54 [timbl_] | timbl_ has joined #swig |
08:04:32 [Logomachist] | Logomachist has left #swig |
08:05:11 [pauld] | pauld has quit |
08:10:46 [Paul_Miller] | Paul_Miller has joined #swig |
08:12:45 [tobyink] | tobyink has joined #swig |
08:13:17 [tobyink] | tobyink has left #swig |
08:17:01 [FabGandon] | FabGandon has joined #swig |
08:17:08 [gromgull] | gromgull has joined #swig |
08:18:49 [nathany] | nathany has quit |
08:19:42 [reto] | reto has joined #swig |
08:24:45 [IvanHerman] | IvanHerman has joined #swig |
08:33:15 [leobard] | leobard has joined #swig |
08:37:20 [hg] | hg has joined #swig |
08:46:47 [duryodhan] | duryodhan has joined #swig |
08:47:38 [reto] | reto has quit |
08:48:03 [duryodhan] | hi .. total noob to semantic web/rdf etc. .... I wanted to use dbpedia .. any ideas/help ... for e.g, I have a list of articles about whom I want the infobox content -- using dbpedia's infobox crawl how do I get it ? |
08:49:33 [mmmmmrob] | mmmmmrob has joined #swig |
08:55:52 [reto] | reto has joined #swig |
08:59:59 [reto] | reto has quit |
09:05:12 [bengee] | you mean, how to get from wikipedia URLs to the DBPedia ones? |
09:05:26 [stetho] | stetho has joined #swig |
09:06:19 [tobyink] | tobyink has joined #swig |
09:08:36 [KjetilK] | duryodhan: basically, if you have a URL http://en.wikipedia.org/wiki/Resource_Description_Framework then the URI of the abstract thing which is RDF is http://dbpedia.org/resource/Resource_Description_Framework, and dereferencing that will give you a description of it |
09:08:53 [pauld] | pauld has joined #swig |
09:09:23 [KjetilK] | so if you GET that URI, you will first get a 303 redirect which basically means "actually, that's an abstract thing, but this is the description of it" |
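The URL-to-URI mapping KjetilK describes can be sketched in a few lines of Python. This is only an illustration under the assumption that the article title carries over unchanged, as it does in his example; the function name `wikipedia_to_dbpedia` is made up here and is not part of any DBpedia tooling.

```python
from urllib.parse import urlsplit

WIKIPEDIA_PATH = "/wiki/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url: str) -> str:
    """Map an English Wikipedia article URL to a DBpedia resource URI.

    Hypothetical helper: it assumes the article title is reused verbatim
    as the local name of the DBpedia resource.
    """
    parts = urlsplit(url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(WIKIPEDIA_PATH):
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    # Everything after /wiki/ is the article title.
    title = parts.path[len(WIKIPEDIA_PATH):]
    return DBPEDIA_RESOURCE + title
```

Calling it on the Wikipedia URL from the message above yields the DBpedia URI quoted there; a GET on that URI is then expected to answer with the 303 redirect KjetilK mentions.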
09:09:29 [duryodhan] | yeah but I have a huge list for which I need to get data .. so I downloaded the infobox dump directly |
09:09:37 [duryodhan] | so I don't want to use the web |
09:09:43 [KjetilK] | oh, ok |
09:10:00 [KjetilK] | * KjetilK hasn't worked with the infobox dump |
09:11:03 [duryodhan] | let me give an example .. |
09:11:28 [duryodhan] | lets say I want to get the headquarter city and the url of all companies listed on nasdaq |
09:11:38 [duryodhan] | to get their wikipedia pages is easy .. http://en.wikipedia.org/wiki/Category:Companies_listed_on_NASDAQ |
09:12:18 [duryodhan] | but to get the city (or revenue or URL or whatever ) from the infobox on each page is a pain to do for me ... |
09:12:25 [duryodhan] | so I thought I would use the infobox dump |
09:12:31 [duryodhan] | or is there a simpler way to do it ? |
09:13:05 [duryodhan] | now what I can do is for each wikipedia page I want info about, I do a grep in the dump and use the output ... but I am _sure_ there is a better way |
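One alternative to running grep once per page is a single streaming pass over the dump that keeps only the triples whose subject is in a set of wanted URIs. The sketch below assumes the N-Triples form of the dump, where each line starts with the subject written as `<uri>`; the function name `triples_about` is hypothetical.

```python
from typing import Iterable, Iterator, Set


def triples_about(lines: Iterable[str], subjects: Set[str]) -> Iterator[str]:
    """Yield the N-Triples lines whose subject URI is in `subjects`."""
    for line in lines:
        # The subject is the first whitespace-delimited term on the line.
        fields = line.split(None, 1)
        if not fields:
            continue  # skip blank lines
        subject = fields[0]
        # URIs are written as <uri>; strip the brackets before the lookup.
        if subject.startswith("<") and subject.endswith(">") and subject[1:-1] in subjects:
            yield line.rstrip("\n")
```

Because `lines` can be any iterable, a compressed dump could be fed in directly, for example with `bz2.open(path, "rt", encoding="utf-8")`, where `path` is wherever the downloaded file was saved. The set lookup keeps the cost at one pass regardless of how many pages are wanted.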
09:13:21 [libby] | libby has joined #swig |
09:13:28 [KjetilK] | hmmm, yeah |
09:14:07 [shellac] | shellac has joined #swig |
09:14:26 [KjetilK] | hmpf, things started to get slow for me here, let me see |
09:14:45 [iand] | iand has joined #swig |
09:15:25 [duryodhan] | kjetilk ? |
09:15:28 [KjetilK] | * KjetilK needs newer hardware :-) |
09:15:36 [reto] | reto has joined #swig |
09:15:43 [duryodhan] | aah .. I have lots of hardware .. what do you want to do ?? :D |
09:16:56 [KjetilK] | basically, I think a SPARQL query could get you what you want, I just don't remember the properties you need to look for |
09:17:12 [KjetilK] | * KjetilK kills firefox and starts over |
09:17:50 [duryodhan] | what are properties ? dbpprop:website ? |
09:18:00 [KjetilK] | right |
09:19:11 [KjetilK] | so, you should be able to say stuff like SELECT * WHERE { ?company dbpprop:website ?website ; dbo:category <http://dbpedia.org/resources/Category:Companies_listed_on_NASDAQ> } |
09:20:52 [duryodhan] | that gave an error. . I changed it to dbpedia:category but that gave nothing as output too |
09:21:19 [KjetilK] | yeah, that was just an example off my head without checking |
09:22:01 [duryodhan] | yeah .. but whats wrong with SELECT * WHERE { ?company dbpprop:website ?website ; dbpedia:category <http://dbpedia.org/resources/Category:Companies_listed_on_NASDAQ> } ? |
09:22:38 [KjetilK] | hmmmm, so actually, there is a bit of reasoning involved, since the company isn't necessarily annotated with that it is listed on NASDAQ |
09:24:02 [duryodhan] | eh ? |
09:24:03 [KjetilK] | lets see if there is an easier way to do it |
09:24:16 [timbl_] | timbl_ has quit |
09:24:22 [KjetilK] | basically, you need the YAGO ontology to find the stuff on DBPedia |
09:24:36 [duryodhan] | I think skos:subject dbpedia:category Companies list on NASDAQ ? |
09:25:44 [KjetilK] | right |
09:25:53 [KjetilK] | you know the dataset better than me :-) |
09:25:53 [leobard] | leobard has quit |
09:26:00 [duryodhan] | but how do I search for that ? |
09:26:05 [KjetilK] | so this is the page with it http://dbpedia.org/class/yago/CompaniesListedOnNASDAQ |
09:26:07 [duryodhan] | I don't know SPARQL |
09:26:43 [duryodhan] | yeah |
09:28:27 [duryodhan] | but say now I want to do it for another category .. ? |
09:28:35 [duryodhan] | all don't seem to have yago pages .. |
09:28:45 [duryodhan] | if they have a mention in the ontology ... for e.g http://dbpedia.org/ontology/EducationalInstitution |
09:28:51 [duryodhan] | how do I work with that ? |
09:29:32 [duryodhan] | so if I want urls of all educationalInstitutions ? |
09:30:04 [tobyink] | duryodhan: http://sparql.pastebin.com/m74ac21fc |
09:30:15 [tobyink] | Run it at http://dbpedia.org/sparql |
09:30:53 [KjetilK] | it is good to have people like tobyink around who actually know the dataset :-) |
09:31:33 [KjetilK] | * KjetilK got confused because the first example he found wasn't actually annotated with the NASDAQ URI, the data isn't perfect |
09:31:41 [tobyink] | I only really started looking at it properly a couple of weeks ago when I was trying to find a way of listing cheeses. |
09:31:51 [KjetilK] | :-) |
09:32:16 [tobyink] | Generally I find the best way of querying it is to use the presence of infoboxes as a proxy to infer rdf:type. |
09:32:18 [KjetilK] | we've been using some infobox data at work, but I didn't do the actual work |
09:33:05 [KjetilK] | and we should probably have downloaded the infobox dump rather than use SPARQL, but I wasn't aware of the infobox dump before now |
09:33:38 [tobyink] | e.g. instead of looking for an rdf:type of <http://sw.opencyc.org/2008/06/10/concept/Mx4rwOD7mJwpEbGdrcN5Y29ycA>, I would look for <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_cheese> |
09:34:35 [duryodhan] | tobyink: but then you are assuming that all cheese pages use infobox_cheese and not some other infobox |
09:34:44 [tobyink] | They do. |
09:35:00 [tobyink] | Or at least, they do more reliably than any other way of classifying them. |
09:35:08 [KjetilK] | and if they don't, there isn't much you can do about it |
09:35:29 [tobyink] | It's a big, visible part of the page, so editors tend to notice its presence/absence. |
09:35:53 [KjetilK] | the data isn't better than the effort gone into maintaining Wikipedia, but then Wikipedia is pretty good |
09:36:21 [KjetilK] | * KjetilK trots off to focus on $work |
09:37:57 [duryodhan] | tobyink: yeah .. but what I would do is start off from some page like .. Category: Type of Cheese |
09:38:02 [duryodhan] | and then go to all the pages in it |
09:38:23 [duryodhan] | http://en.wikipedia.org/wiki/Category:Cheeses |
09:38:24 [dc_swig] | A: http://en.wikipedia.org/wiki/Category:Cheeses from duryodhan |
09:38:43 [duryodhan] | start with that .. get all the names of pages .. and search in the infobox dump for them |
09:39:24 [duryodhan] | * duryodhan was not only ignoring work .. he forgot about a meeting |
09:39:36 [duryodhan] | got to run ... but please ping any thing you can think of |
09:40:27 [KjetilK] | :-) |
09:43:21 [jaimico] | jaimico has joined #swig |
09:45:25 [tobyink] | duryodhan: URLs of universities - http://sparql.pastebin.com/m5c438e6c |
09:49:41 [mischat] | mischat has joined #swig |
09:51:21 [cerealtom] | cerealtom has joined #swig |
09:51:29 [cerealtom] | cerealtom has left #swig |
09:51:55 [mmmmmrob] | mmmmmrob has quit |
09:56:27 [swh] | swh has joined #swig |
09:58:32 [ldodds] | ldodds has joined #swig |
10:02:00 [BenO] | BenO has joined #swig |
10:03:17 [jaimico] | jaimico has quit |
10:07:39 [Arnia] | Arnia has quit |
10:14:33 [ephemerian] | ephemerian has joined #swig |
10:21:17 [duryodhan] | tobyink: thats great .. it will help a lot .. but if I already have the name of the wikipedia page .. can't I do something with the infobox dump directly (cos from my experience I know that not all pages have infobox_university .. some may have infobox_name_of_univ) |
10:21:41 [duryodhan] | just a nagging concern .. what you gave will cover a lot of pages no doubt |
10:21:53 [tobyink] | Sorry, don't really know what an infobox dump is ?? |
10:23:07 [besbes] | besbes has joined #swig |
10:23:33 [Arnia] | Arnia has joined #swig |
10:24:02 [duryodhan] | http://downloads.dbpedia.org/preview.php?file=3.2_sl_en_sl_infobox_en.nt.bz2 |
10:24:03 [dc_swig] | B: http://downloads.dbpedia.org/preview.php?file=3.2_sl_en_sl_infobox_en.nt.bz2 from duryodhan |
10:24:05 [duryodhan] | http://wiki.dbpedia.org/Downloads32 |
10:24:06 [dc_swig] | C: http://wiki.dbpedia.org/Downloads32 from duryodhan |
10:28:19 [libby] | duryodhan the bot picks up any urls with no space before them on the line (which is fine if you want it to! goes to swig.xmlhack.com) |
10:28:30 [duryodhan] | ohh |
10:28:31 [libby] | sorry it's a bit of a pain, but a space before it'll stop it |
10:28:37 [duryodhan] | ok |
10:28:39 [duryodhan] | thanks |
10:28:46 [libby] | :-) |
10:29:32 [jansc_] | jansc_ has joined #swig |
10:29:48 [tobyink] | OK, over 200 MB, not downloading that! But from a sample, it looks like the SPARQL query I provided will mostly work. Probably want to replace foaf:name with dbp:name and foaf:page with dbp:url. |
10:29:49 [jansc_] | jansc_ has quit |
10:32:01 [tobyink] | SPARQL can be run on the infobox dump locally using something like roqet <http://librdf.org/rasqal/roqet.html>. |
10:32:43 [duryodhan] | ohh .. can you walk me through the query you provided ? I am totally new to SPARQL ... |
10:34:04 [Arnia_] | Arnia_ has joined #swig |
10:35:09 [Arnia] | Arnia has quit |
10:35:11 [Arnia_] | Arnia_ is now known as Arnia |
10:35:49 [tobyink] | Are you familiar with SQL though? |
10:36:56 [emrojo] | emrojo has joined #swig |
10:37:02 [sYskk] | sYskk has joined #swig |
10:38:29 [tobyink] | Let's assume you have roqet installed, and your infobox dump is saved with filename "infobox.nt". |
10:39:12 [tobyink] | Then a SPARQL query can be run on that file using the following command-line (I'm assuming here a familiarity with the Unix command line): |
10:39:55 [tobyink] | roqet -i sparql -r json -s 'infobox.nt' -e 'SPARQL GOES HERE' |
10:42:07 [duryodhan] | yeah |
10:42:14 [duryodhan] | I am asking about the SPARQL itself |
10:42:27 [minmax] | minmax has joined #swig |
10:42:29 [duryodhan] | why did you write the query the way you wrote .. what does that ?edu mean |
10:42:30 [tobyink] | We want to find a list of resources "foo" such that the following N-triples statement is correct : foo <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_album> |
10:42:52 [tobyink] | So we use a placeholder/variable called "?foo" |
10:43:14 [tobyink] | SPARQL placeholders are named ?something or $something. The two are interchangeable. |
10:43:37 [tobyink] | SELECT * WHERE { ?foo <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_album> . } |
10:43:46 [duryodhan] | ok that I get now ... |
10:43:50 [duryodhan] | whats the optional ? |
10:43:54 [tobyink] | Or we could be more specific: SELECT ?foo WHERE { ?foo <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_album> . } |
10:44:14 [duryodhan] | that would just give me the name of the page right .. nothing else ? |
10:44:28 [tobyink] | OPTIONAL is not supported by roqet (complain to dajobe). It's essentially analogous to a SQL LEFT JOIN. |
10:44:40 [tobyink] | It would give you the URI of the resource. |
10:45:04 [tobyink] | So, say we also wanted to get album cover art... |
10:45:12 [gromgull] | gromgull has quit |
10:45:14 [duryodhan] | forget roqet for now .. just say the dbpedia.org/sparql page |
10:45:23 [duryodhan] | ok |
10:45:50 [duryodhan] | and how do I say to only get me results for USA ? |
10:45:55 [tobyink] | SELECT ?resource ?art WHERE { ?resource <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_album> . ?resource <http://dbpedia.org/property/cover> ?art . } |
10:46:37 [tobyink] | The bit in the WHERE {} clause is essentially a template in Turtle which we want to be true for all results. |
10:46:51 [jansc] | jansc has quit |
10:46:51 [duryodhan] | and the . ? |
10:47:06 [tobyink] | The "?" just says "this is a variable". |
10:47:30 [tobyink] | However, for some albums, perhaps Wikipedia doesn't have the cover art, so no rows would be returned in the output at all. |
10:47:40 [gromgull] | gromgull has joined #swig |
10:48:00 [tobyink] | And that's where OPTIONAL comes in - you're saying that you still want to know about the album, even if it has no cover art. |
10:48:14 [tobyink] | SELECT ?resource ?art WHERE { ?resource <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_album> . OPTIONAL { ?resource <http://dbpedia.org/property/cover> ?art . } } |
10:48:48 [duryodhan] | no no .. the dot .. whats the dot for ? |
10:48:51 [duryodhan] | or is that a new line ? |
10:49:08 [tobyink] | Similar to a SQL left join, where you're saying that you still want rows for the left table even if there is no corresponding data in the right table. |
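The left-join behaviour of OPTIONAL can be mimicked over a plain list of triples. The following Python sketch is not a SPARQL engine; it only reproduces the shape of the result for the album and cover query above, and the function name and sample data used with it are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

TEMPLATE = "http://dbpedia.org/property/wikiPageUsesTemplate"
ALBUM = "http://dbpedia.org/resource/Template:infobox_album"
COVER = "http://dbpedia.org/property/cover"

Triple = Tuple[str, str, str]


def albums_with_optional_cover(triples: List[Triple]) -> List[Tuple[str, Optional[str]]]:
    """Return (resource, art) rows; art is None where no cover triple exists."""
    # Required pattern: ?resource wikiPageUsesTemplate infobox_album
    albums = [s for (s, p, o) in triples if p == TEMPLATE and o == ALBUM]
    # Optional pattern: ?resource cover ?art
    covers: Dict[str, List[str]] = {}
    for s, p, o in triples:
        if p == COVER:
            covers.setdefault(s, []).append(o)
    rows: List[Tuple[str, Optional[str]]] = []
    for album in albums:
        if album in covers:
            rows.extend((album, art) for art in covers[album])
        else:
            # No match for the optional part: keep the row, leave ?art unbound.
            rows.append((album, None))
    return rows
```

Without the `else` branch the function would behave like the earlier query with both patterns required, silently dropping albums that have no cover.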
10:49:15 [tobyink] | dot is part of turtle. |
10:49:36 [tobyink] | It marks the end of a triple. |
10:49:46 [duryodhan] | ok ok |
10:49:56 [duryodhan] | and how do I say ... only get me results for Country : USA ? |
10:50:01 [tobyink] | (Though if you've only got one triple in a clause, you can leave it out.) |
10:50:21 [duryodhan] | I would need something like "templateHasCountry" as one of the triple ? |
10:52:05 [LotR] | LotR has quit |
10:52:30 [tobyink] | Something like this... http://sparql.pastebin.com/m7518cf92 |
10:53:00 [LotR] | LotR has joined #swig |
10:53:49 [duryodhan] | but in that query whats the point of the optional part .. if you have done select * .. wouldn't it return everything it has for that entity ? |
10:54:29 [ldodds] | duryodhan: the * means: every variable in the query |
10:54:38 [tobyink] | ldodds: exactly. |
10:54:42 [ldodds] | not every property in the database |
10:55:09 [ldodds] | A triple store doesn't have a data dictionary like a relational database, so you have to match everything you're interested in, in the query pattern, in order to select it |
10:55:53 [duryodhan] | ok ok |
10:56:14 [duryodhan] | and ?edu dbpprop:country http://dbpedia.org/resource/United_States would also restrict to USA ? |
10:56:39 [KjetilK] | * KjetilK had a good moment yesterday when my colleague said "actually, now I suddenly find SPARQL easier to write than SQL" :-) |
10:57:21 [ldodds] | KjetilK: cool :) |
10:59:29 [KjetilK] | duryodhan: yeah, I would think so |
10:59:36 [tobyink] | duryodhan: SPARQL has "DESCRIBE" to do what you want "SELECT *" to do. |
10:59:54 [tobyink] | DESCRIBE's behaviour is a little implementation-specific though. |
11:00:12 [tobyink] | This works on dbpedia.org/sparql and returns a very big document - DESCRIBE ?edu WHERE { ?edu <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:infobox_university> . ?edu <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/United_States> . } |
11:00:13 [KjetilK] | * KjetilK likes DESCRIBE a lot, it basically says "give me all you know about <foo> in RDF" |
11:00:47 [KjetilK] | I hope we can specify DESCRIBE a bit more in the new SPARQL WG |
11:01:33 [ldodds] | KjetilK: argh, that reminds me that I forgot one of my suggestions to WG |
11:01:48 [mischat] | yvesr: r u there ? |
11:01:48 [aklassen] | aklassen has joined #swig |
11:01:55 [ldodds] | I'll send it in anyway, hope the deadline wasn't too strict :) |
11:02:28 [ldodds] | * ldodds wants a DESCRIBE ... USING ... type feature to specify the algorithm |
11:02:39 [duryodhan] | hmm so theoretically .. my original use case I could do DESCRIBE ?edu WHERE { ?edu foaf:page <url> } ? |
11:02:48 [duryodhan] | for each url I know of |
11:02:59 [KjetilK] | ldodds: ah, yeah, like virtuoso does a "define" in the beginning? |
11:03:28 [ldodds] | not looked, but if it allows you to say I want a bnode closure, or other form of bounding the graph, then yes |
11:03:38 [KjetilK] | ah, yup |
11:03:42 [tobyink] | duryodhan: Assuming that <url> is a Wikipedia page, then yes, that should work. |
11:04:29 [duryodhan] | ok great .. I won't do it .. but I think I am getting the hang of this .. |
11:05:35 [tobyink] | Good, good. The SPARQL spec actually has quite a lot of nice examples <http://www.w3.org/TR/rdf-sparql-query/> though none of them use dbpedia-specific properties, resources, etc. |
11:06:07 [ldodds] | duryodhan: some background on difference between the various SPARQL query forms (SELECT, DESCRIBE, etc): http://www.slideshare.net/ldodds/sparql-query-forms |
11:06:09 [mischat] | phenny, tell yvesr : in the bbc music rdf I think you are using foaf:maker incorrectly, for example here : http://www.bbc.co.uk/music/artists/cc197bad-dc9c-440d-a5b5-d52ba2e14234, <#artist> foaf:made _:bnode, seems like the correct property to use (when talking about records). foaf:maker's domain is an owl:Thing and its range is a foaf:agent, whereas foaf:made is the other way round. |
11:06:09 [phenny] | mischat: I'll pass that on when yvesr is around. |
11:06:53 [duryodhan] | wow that describe is taking a lot of time on virtuoso .. maybe I shouldn't have run it on that ? |
11:07:03 [duryodhan] | I just wanted to see some sample outputs |
11:07:17 [cygri] | cygri has joined #swig |
11:07:59 [tobyink] | DESCRIBE will often return a *lot* of data. |
11:08:15 [tobyink] | There is no way of specifying how much you want. |
11:08:37 [tobyink] | Which is why it's usually best to use SELECT and a few fields that you're interested in. |
11:10:24 [chimezie_] | chimezie_ has joined #swig |
11:11:01 [chimezie] | chimezie has quit |
11:11:33 [duryodhan] | right now to get the website I am doing ?edu dbpprop:website ?website . .. but this is on virtuoso |
11:11:49 [chimezie_] | chimezie_ is now known as chimezie |
11:11:52 [duryodhan] | will this work offline .. or will I need to use some URI instead of dbpprop:website ? |
11:12:00 [tobyink] | phenny, tell yvesr : I agree with mischat. Should use foaf:made, or perhaps dc:contributor |
11:12:01 [phenny] | tobyink: I'll pass that on when yvesr is around. |
11:12:05 [karlcow] | karlcow has quit |
11:12:24 [mischat] | foaf > dc |
11:12:33 [mischat] | but yeah i agree with you too tobyink ;) |
11:12:43 [danbri] | you could use dc terms namespace |
11:12:48 [Arnia] | Arnia has quit |
11:12:54 [danbri] | the old dc namespace was quite messy re best practice for dc:creator etc |
11:13:02 [danbri] | dc terms it is a bit more ontology-style |
11:13:32 [tobyink] | dbpprop:website relies on virtuoso knowing what "dbpprop:" is a shorthand for. I always tend to use full URIs in SPARQL as defining prefixes doesn't tend to save many bytes. |
11:13:56 [tobyink] | I always use xmlns:dc="http://purl.org/dc/terms/" |
11:14:15 [tobyink] | So when I say dc:contributor, I mean http://purl.org/dc/terms/contributor |
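For queries run away from the DBpedia endpoint, the shorthand has to be spelled out one way or another: either expand each prefixed name to a full URI, or put PREFIX declarations in front of the query. A small Python sketch of both follows. The `dc` mapping is the one tobyink gives above; the `dbpprop` mapping is an assumption, chosen to match the full property URIs quoted earlier in this log, and both helper names are hypothetical.

```python
from typing import Dict

PREFIXES: Dict[str, str] = {
    "dbpprop": "http://dbpedia.org/property/",  # assumption, see lead-in
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(qname: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Turn prefix:local into a full URI in SPARQL angle-bracket form."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return "<" + prefixes[prefix] + local + ">"


def prefix_header(prefixes: Dict[str, str] = PREFIXES) -> str:
    """Build PREFIX declarations to prepend to a query."""
    return "".join(f"PREFIX {name}: <{uri}>\n" for name, uri in prefixes.items())
```

With either approach the query no longer depends on the endpoint having the shorthand predefined, which is the concern raised about running the same query offline.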
11:15:51 [shellac] | which implies dc elements 1.1 contributor |
11:16:10 [duryodhan] | ok |
11:16:32 [mischat] | ah the joys of URIs |
11:16:58 [justben] | justben has joined #swig |
11:18:23 [tobyink] | phenny, tell yvesr Also, why http://www.perceive.net/schemas/relationship/ ?? The latest version is at this URI - http://purl.org/vocab/relationship/ - and has been for over 5 years |
11:18:24 [phenny] | tobyink: I'll pass that on when yvesr is around. |
11:19:17 [mischat] | ah yves is going to love us now ... |
11:20:09 [mischat] | the relationship vocab doesn't have a nemesis relationship shame really |
11:20:40 [tobyink] | mischat : http://buzzword.org.uk/rdf/xen |
11:21:03 [mischat] | nice |
11:21:37 [tobyink] | iand: http://purl.org/vocab/relationship/ uses owl:equivalentClass when it means owl:equivalentProperty |
11:25:04 [tobyink] | mischat : <html><head profile="http://xen.adactio.com/"></head><a rel="nemesis" href="http://example.com/eve">Eve</a></html> |
11:25:21 [tobyink] | Works in Swignition <http://buzzword.org.uk/swignition/>. |
11:25:41 [mischat] | cool |
11:25:50 [danbri] | * danbri trying to load lcsh.nt into ARC rdf store... |
11:25:56 [hg] | hg has quit |
11:27:23 [tobyink] | e.g. http://examples.tobyinkster.co.uk/nemesis.html --> http://srv.buzzword.org.uk/turtle/examples.tobyinkster.co.uk/nemesis.html |
11:27:46 [KjetilK] | danbri: is lcsh up again in some form? |
11:29:18 [duryodhan] | tobyink: roqet doesn't seem to work .. I think it wants the infobox dump to be an xml ,... whereas this one is just a CSV file |
11:29:28 [jhalv] | jhalv has joined #swig |
11:29:58 [danbri] | Kjetilk, nope |
11:30:07 [tobyink] | Won't work with the CSV version of the infobox dump, but should work with the N-Triples version. |
11:30:30 [danbri] | but i kept a copy for a rainy day ... http://danbri.org/2008/lcsh |
11:30:55 [tobyink] | It should be pretty easy to make N-Triples from CSV though. |
11:30:57 [danbri] | just for experiments etc... not meant to cause trouble at LOC |
11:30:58 [KjetilK] | danbri: ah, ok. And the rainy day came :-( |
11:31:15 [danbri] | i'm sure it'll be back. but in meantime, it's good for experimenting with |
11:31:16 [tobyink] | I think you should be able to just replace the commas with spaces and add a "." to the end of each line. |
11:32:03 [danbri] | eg i started some work to find potential wiki/db-pedia links - http://svn.foaf-project.org/foaftown/2008/lcshplus/lcsh2wiki.rb |
11:32:25 [KjetilK] | danbri: nice! |
11:32:54 [duryodhan] | tobyink: the end of line should be a space followed by a "." or just a "." |
11:32:57 [duryodhan] | ? |
11:33:34 [tobyink] | Don't think it matters, but I'd include a space just to be sure. |
11:34:50 [duryodhan] | cat ~/Desktop/downloads/infobox_en.csv | tr "\t" " " | sed "s/$/ ./g" |
11:35:12 [duryodhan] | should do it |
11:35:20 [tobyink] | Ah, so not really comma-separated, but tab-separated? If so, then the tabs should be fine as is. |
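The same conversion can be sketched in Python for anyone who prefers it to the tr and sed pipeline. It rests on an untested assumption carried over from the discussion above: that every line has three tab-separated fields and that each field is already a valid N-Triples term, so only the separators and the closing " ." need attention. The function name is hypothetical.

```python
from typing import Iterable, Iterator


def tsv_to_ntriples(lines: Iterable[str]) -> Iterator[str]:
    """Rewrite tab-separated subject/predicate/object lines as N-Triples."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first two tabs only, so the object is kept whole.
        fields = line.split("\t", 2)
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 tab-separated fields")
        # Space-separate the terms and add the statement terminator.
        yield " ".join(fields) + " ."
```

As tobyink notes, the tabs could also be left in place; replacing them with spaces simply mirrors the shell command. Working line by line keeps memory use flat even for a file of the size mentioned here.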
11:35:43 [duryodhan] | * duryodhan is mad |
11:35:45 [duryodhan] | :D |
11:35:48 [duryodhan] | its a 1.5 GB file |
11:35:53 [duryodhan] | screw it .. let it run |
11:36:12 [tobyink] | tr, sed are pretty fast anyway. |
11:36:26 [duryodhan] | yeah |
11:36:40 [duryodhan] | although tr is obscenely faster |
11:36:44 [duryodhan] | afaik |
11:37:54 [duryodhan] | does the online virtuoso sparql limit the size of results ? |
11:38:11 [tobyink] | Yes, it seems to. |
11:38:41 [shellac] | it also cost estimates, to avoid nasty queries |
11:39:07 [libby] | ooh, that's col |
11:39:08 [shellac] | (if you'll excuse the verbing) |
11:39:08 [libby] | cool |
11:39:16 [tobyink] | e.g. I only got back universities starting with the letters A-C here : http://sparql.pastebin.com/m5c438e6c |
11:39:27 [duryodhan] | so how do I ask it not to ? :) |
11:39:38 [duryodhan] | who do beg / borrow/steal from ? |
11:39:42 [duryodhan] | *do I |
11:39:59 [mischat] | can you not use limit and offset and order by in virtuoso |
11:40:01 [tobyink] | kidehen probably, but he's not here right now. |
11:40:03 [shellac] | you could page it yourself |
11:40:13 [shellac] | oh, no limit / offset :-( |
11:40:14 [KjetilK] | I have a query that can kneel Virtuoso, but it is being worked on |
11:40:26 [mischat] | lame |
11:40:31 [shellac] | query for things starting a, then b etc |
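shellac's suggestion of querying letter by letter could be scripted roughly as below. This is a sketch only: it generates one SPARQL query string per initial letter and does not contact any endpoint. The template URI is the one used in the DESCRIBE example earlier; using foaf:name as the name property is an assumption based on tobyink's remark about his pastebin query, and the function names are hypothetical.

```python
import string
from typing import List

TEMPLATE_PROP = "http://dbpedia.org/property/wikiPageUsesTemplate"
UNIVERSITY = "http://dbpedia.org/resource/Template:infobox_university"
NAME_PROP = "http://xmlns.com/foaf/0.1/name"  # assumption, see lead-in


def query_for_initial(letter: str) -> str:
    """Build a query for universities whose name starts with `letter`."""
    if len(letter) != 1 or letter not in string.ascii_letters:
        raise ValueError("expected a single ASCII letter")
    return (
        "SELECT ?edu ?name WHERE { "
        f"?edu <{TEMPLATE_PROP}> <{UNIVERSITY}> . "
        f"?edu <{NAME_PROP}> ?name . "
        # Case-insensitive match on the first character of the name.
        f'FILTER regex(str(?name), "^{letter}", "i") '
        "}"
    )


def queries_by_initial() -> List[str]:
    """One query per letter a-z, to be run one after another."""
    return [query_for_initial(letter) for letter in string.ascii_lowercase]
```

Names that begin with a digit or a non-ASCII character fall outside all 26 queries, so a catch-all query would still be needed for completeness.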
11:40:42 [mischat] | double lame |
11:41:11 [shellac] | in fairness how many public full SQL services do you know? |
11:41:20 [mischat] | none |
11:41:24 [mischat] | this is very true |
11:41:28 [tobyink] | better just to download the data dump and query it on your own machine. |
11:41:33 [mischat] | yup |
11:41:49 [emrojo] | emrojo has quit |
11:42:00 [KjetilK] | Peter Fox said public SQL services have been tried, and failed spectacularly |
11:43:07 [shellac] | yahoo or amazon would run xslt for you, but I think they killed it after a fixed time limit |
11:43:29 [BenO] | Btw the LoC thing won't be back up at lcsh.info - in theory, it should appear in the next week at http://id.loc.gov/ (not holding my breath though) |
11:43:41 [iand] | tobyink: thanks for that re relationship equivClass |
11:43:43 [shellac] | running other people's code is brave |
11:44:16 [duryodhan] | http://pastebin.ca/1358124 whats wrong with that ? |
11:44:17 [dc_swig] | D: http://pastebin.ca/1358124 from duryodhan |
11:44:20 [duryodhan] | aargh |
11:44:27 [duryodhan] | die dc_swig die!! :D |
11:45:08 [duryodhan] | anyhow .. in the query I pasted .. for http://dbpedia.org/page/Hillcrest_Lutheran_Academy I get nothing in city .. but the page clearly lists it |
11:46:03 [tobyink] | iand: no problem. By the way, I've implemented <head profile="http://purl.org/vocab/relationship/"> in Swignition (subversion trunk only, not released yet) using mostly the same code as XFN. |
11:46:55 [duryodhan] | umm roqet died on the infobox dump .. file too large |
11:48:13 [tobyink] | don't put commas in SELECT list - differs from SQL. Should be http://pastebin.ca/1358125 |
11:48:56 [KjetilK] | duryodhan: just put a space or a letter before the URL and it won't fire :-) |
11:49:13 [duryodhan] | kjetilk : I keep forgetting |
11:49:33 [KjetilK] | duryodhan: yeah, everyone forgets it now and then :-) |
11:50:01 [shellac] | BenO: just the link I was totally failing to find on their site. thanks. |
11:50:05 [duryodhan] | tobyink: that still doesn't solve the problem .. the problem is that http://dbpedia.org/page/Hillcrest_Lutheran_Academy points to a page which has a redirect ... |
11:50:11 [duryodhan] | and is otherwise empty |
11:50:15 [duryodhan] | so I don't get a name for the city |
11:52:09 [BenO] | shellac, they say it will hold a replica of what edsu did, but they said 'in 6 -8 weeks' ... about 7 weeks ago |
11:54:12 [chimezie] | chimezie has quit |
11:57:28 [Zach_Beauvais] | Zach_Beauvais has joined #swig |
11:59:15 [mintsauce] | mintsauce has joined #swig |
12:02:52 [myakura] | myakura has joined #swig |
12:04:16 [shepazu] | shepazu has joined #swig |
12:05:02 [shepazutoo] | shepazutoo has quit |
12:07:18 [shepazutoo] | shepazutoo has joined #swig |
12:09:09 [mintsauce] | mintsauce has quit |
12:13:07 [tobyink] | duryodhan: theoretically, this should work - http://pastebin.ca/1358131 - but it seems to cause problems for virtuoso |
12:15:13 [BenO] | BenO has quit |
12:20:04 [jaresty] | jaresty has quit |
12:21:28 [shepazu] | shepazu has quit |
12:21:49 [tobyink] | With that, you'll find that the city name ends up in ?cityname if there's no redirect or ?cityname2 if there's a redirect. |
12:21:49 [tobyink] | (But of course, you don't find that - you find an error message instead.) |
12:22:06 [tobyink] | Though the individual bits of the query work on their own - http://pastebin.ca/1358133 |
12:25:53 [duryodhan] | yeah.. but I think this should have been handled by the dbpedia miner itself ... |
12:25:59 [duryodhan] | if there is only 1 redirect location |
12:26:06 [adi112358] | adi112358 has joined #swig |
12:28:02 [FabGandon] | FabGandon has left #swig |
12:29:29 [tobyink] | Although it does seem that it might be sensible for dbpedia to handle redirects when data mining, it actually won't end up working very well as redirects on Wikipedia are used for two different purposes. |
12:29:51 [tobyink] | 1. In the case you have found, where a page redirects to a synonym page. |
12:30:40 [tobyink] | 2. But also a specific thing will redirect to a more general thing. e.g. "Abe Simpson" might redirect to "Characters in the Simpsons" (it doesn't, but that's just an example) |
12:31:27 [duryodhan] | yeah in case 2 it won't work |
12:31:30 [adi112358] | adi112358 has left #swig |
12:32:07 [tobyink] | If, for example, "Fergus Falls" redirected to "Minnesota" you'd end up with "Minnesota" as a city name. |
12:32:41 [tobyink] | So it's useful to explicitly distinguish between a page and all the pages that redirect to it. Even if it's a bit of a pain to traverse redirects in queries. |
12:33:25 [duryodhan] | or maybe the dbpedia infobox miner should directly take the text written there as city name and connect it to a resource separately ... |
12:33:59 [tobyink] | (Of course, as dbpedia has nice pretty URIs, you could always cheat by just looking at the city's URI and stripping off http://dbpedia.org/resource/ from the front of it.) |
12:35:04 [duryodhan] | ohh are the dbpedia urls different from the urls in wikipedia ? I thought every en.wikipedia.org/wiki/x mapped to en.wikipedia.org/page/x |
12:35:11 [allisterb] | allisterb has quit |
12:35:13 [duryodhan] | *mapped to dbpedia.org/page/x |
12:35:21 [FabGandon] | FabGandon has joined #swig |
12:36:35 [karlcow] | karlcow has joined #swig |
12:39:06 [redduck666] | redduck666 has joined #swig |
12:40:49 [jhalv] | jhalv has quit |
12:41:44 [caedes] | caedes has quit |
12:42:29 [bijan] | bijan has joined #swig |
12:52:37 [mintsauce] | mintsauce has joined #swig |
12:54:51 [nathany] | nathany has joined #swig |
12:59:47 [aklassen] | aklassen has quit |
13:00:16 [aklassen] | aklassen has joined #swig |
13:02:37 [tobyink] | http://en.wikipedia.org/wiki/X = URI for Wikipedia's (English) page about X ; http://dbpedia.org/page/X = URI for dbpedia's page about X ; http://dbpedia.org/resource/X = URI for X itself |
13:02:39 [dc_swig] | E: http://en.wikipedia.org/wiki/X from tobyink |
13:02:45 [tobyink] | d'oh! |
13:04:07 [tobyink] | A:= |
13:04:08 [dc_swig] | Replacement must be a valid URL. |
13:04:21 [tobyink] | A:=http://example.com |
13:04:23 [dc_swig] | Replaced URL of A. |
13:04:28 [tobyink] | B:=http://example.com |
13:04:30 [dc_swig] | Replaced URL of B. |
13:04:34 [tobyink] | C:=http://example.com |
13:04:36 [dc_swig] | Replaced URL of C. |
13:04:42 [tobyink] | D:=http://example.com |
13:04:44 [dc_swig] | Replaced URL of D. |
13:04:57 [pauld] | pauld has quit |
13:05:00 [tobyink] | E:=http://example.com |
13:05:02 [dc_swig] | Replaced URL of E. |
13:05:06 [harbulot] | harbulot has joined #swig |
13:05:17 [bijan] | http://www.youtube.com/watch?v=Hn-8_J5JGPA |
13:05:19 [dc_swig] | F: http://www.youtube.com/watch?v=Hn-8_J5JGPA from bijan |
13:05:28 [bijan] | F:|A video tour of OWL/XML (part one) |
13:05:30 [dc_swig] | Titled item F. |
13:05:40 [mintsauce] | mintsauce has quit |
13:05:47 [bijan] | F: My attempt to demonstrate the value add of the XML toolchain. |
13:05:48 [dc_swig] | Added comment F1. |
13:05:50 [tobyink] | A:|-accidental- |
13:05:51 [dc_swig] | Titled item A. |
13:05:53 [tobyink] | B:|-accidental- |
13:05:54 [dc_swig] | Titled item B. |
13:05:55 [tobyink] | C:|-accidental- |
13:05:57 [dc_swig] | Titled item C. |
13:05:58 [tobyink] | D:|-accidental- |
13:05:59 [dc_swig] | Titled item D. |
13:06:01 [tobyink] | E:|-accidental- |
13:06:03 [dc_swig] | Titled item E. |
13:06:06 [bijan] | F: Esp. my XML weapon of choice, oXygen. |
13:06:07 [dc_swig] | Added comment F2. |
13:06:23 [mintsauce] | mintsauce has joined #swig |
13:06:43 [bijan] | F: Check out my mad screenflow skillz! |
13:06:45 [dc_swig] | Added comment F3. |
13:06:47 [mintsauce] | mintsauce has quit |
13:07:50 [melvster] | melvster has joined #swig |
13:10:18 [aindilis] | aindilis has joined #swig |
13:12:29 [duryodhan] | yeah but the resource url redirects to page for web browsers |
13:13:45 [jaresty] | jaresty has joined #swig |
13:19:20 [tlr] | tlr has joined #swig |
13:28:43 [tobyink] | duryodhan: <http://dbpedia.org/page/Pizza> is the URI for a page about pizzas. <http://dbpedia.org/resource/Pizza> is the URI for actual pizzas. But because pizzas can't be sent over the wire, dbpedia.org redirects you to a page about pizzas in the hope that it will be adequate. |
13:32:18 [jansc] | jansc has joined #swig |
13:32:39 [danbri] | danbri has quit |
13:34:01 [BenO] | BenO has joined #swig |
13:40:01 [FabGandon] | FabGandon has quit |
13:41:15 [MacTed] | MacTed has joined #swig |
13:43:05 [duryodhan] | yeah ... |
13:43:31 [FabGandon] | FabGandon has joined #swig |
13:49:28 [pauld] | pauld has joined #swig |
13:52:38 [CS_] | CS_ is now known as CaptSolo |
13:52:47 [FabGandon] | FabGandon has quit |
13:53:20 [nathany] | nathany has quit |
13:56:31 [FabGandon] | FabGandon has joined #swig |
14:04:57 [lheuer] | lheuer has joined #swig |
14:06:44 [pauld_] | pauld_ has joined #swig |
14:17:38 [FabGandon] | FabGandon has quit |
14:18:15 [pauld_] | pauld_ has quit |
14:18:41 [FabGandon] | FabGandon has joined #swig |
14:22:39 [eikeon] | eikeon has quit |
14:29:54 [pauld] | pauld has quit |
14:36:54 [FabGandon] | FabGandon has quit |
14:37:49 [FabGandon] | FabGandon has joined #swig |
14:38:18 [shepazu] | shepazu has joined #swig |
14:40:57 [Arnia] | Arnia has joined #swig |
14:46:27 [shepazutoo] | shepazutoo has quit |
14:51:31 [reto] | reto has quit |
14:53:04 [tlr] | tlr has quit |
14:54:04 [mintsauce] | mintsauce has joined #swig |
14:59:01 [Arnia] | Arnia has quit |
15:00:21 [nwalsh] | nwalsh has joined #swig |
15:01:21 [Pipian_] | Pipian_ has joined #swig |
15:02:16 [reto] | reto has joined #swig |
15:03:32 [IvanHerman] | IvanHerman has quit |
15:06:41 [Arnia] | Arnia has joined #swig |
15:08:16 [danbri] | danbri has joined #swig |
15:08:40 [caedes] | caedes has joined #swig |
15:09:04 [nathany] | nathany has joined #swig |
15:13:24 [chimezie] | chimezie has joined #swig |
15:17:42 [pauld] | pauld has joined #swig |
15:24:52 [pauld] | pauld has quit |
15:27:52 [jansc] | jansc has quit |
15:30:01 [cbichis] | cbichis has joined #swig |
15:30:17 [cbichis] | cbichis has left #swig |
15:37:16 [KjetilK] | tobyink: did you add your IN operator suggestion to the SPARQL wiki, or only to the mailing list? |
15:38:29 [allisterb] | allisterb has joined #swig |
15:38:34 [mintsauce] | mintsauce has quit |
15:38:41 [gromgull] | gromgull has quit |
15:38:42 [tobyink] | mailing list only. |
15:41:00 [KjetilK] | OK! |
15:41:09 [KjetilK] | * KjetilK thinks it is an important feature :-) |
15:42:18 [reto] | reto has quit |
15:43:52 [aindilis] | aindilis has quit |
15:44:13 [aindilis] | aindilis has joined #swig |
15:44:24 [jsoltren] | jsoltren has joined #swig |
15:45:03 [leobard] | leobard has joined #swig |
15:45:23 [tobyink] | I think it is useful not only as a way of reducing long sets of "or" operators, but also as a way of natively handling rdf:Lists in SPARQL. rdf:List seems to be a lot more popular than rdf:Bag, rdf:Alt and rdf:Seq, but is comparatively complicated under the hood. |
15:46:33 [KjetilK] | yeah, |
15:46:55 [mintsauce] | mintsauce has joined #swig |
15:46:58 [KjetilK] | if you'd like to delete a bunch of triples with different subjects it is really useful |
15:46:59 [tobyink] | An alternative syntax to the one I originally proposed would be to treat "IN" as a pseudo-predicate such that the SPARQL engine would automatically infer { _:A IN _:L } for every node _:A in list _:L. |
15:47:02 [KjetilK] | a must, I would say |
15:47:39 [KjetilK] | we're using it a lot already, since it is implemented in Virtuoso |
15:47:51 [KjetilK] | ah, yeah, that's an option |
15:47:55 [tobyink] | It could be argued that such a thing would already be allowed by SPARQL engines using entailment, but it would be good if it were available in the baseline standard. |
15:48:09 [KjetilK] | yup |
15:48:49 [tobyink] | But what do I know... I only really learnt SPARQL a few weeks ago... |
15:50:46 [timbl] | timbl has joined #swig |
15:53:27 [ldodds] | ldodds has left #swig |
15:54:43 [LeeF] | KjetilK, it's on the wiki, on the "surface syntax" page I believe |
15:55:20 [KjetilK] | LeeF: oh, cool |
15:56:03 [IvanHerman] | IvanHerman has joined #swig |
15:56:28 [jsoltren] | jsoltren has quit |
16:00:31 [besbes] | besbes has quit |
16:09:48 [pesla] | pesla has quit |
16:11:35 [timbl] | http://www.w3.org/DesignIssues/NoSnooping.html |
16:11:36 [dc_swig] | G: http://www.w3.org/DesignIssues/NoSnooping.html from timbl |
16:13:30 [timbl] | G:|No Snooping! |
16:13:32 [dc_swig] | Titled item G. |
16:14:13 [timbl] | G: The act of reading, like the act of writing, is a pure, fundamental, human act. It must be available without interference or spying. |
16:14:14 [dc_swig] | Added comment G1. |
16:15:31 [Zach_Beauvais] | Zach_Beauvais has quit |
16:17:18 [mischat] | timbl: s/wirtetapping/wiretapping/ |
16:17:38 [Pipian_] | Pipian_ has quit |
16:19:55 [monkeyiq] | monkeyiq has quit |
16:22:27 [mischat] | i guess all this talk of deep packet inspection, is why the new scientist has decided to publish an article about using noise to mask web results. All very scary http://www.newscientist.com/article/mg20126986.000-noise-could-mask-web-searchers-ids.html?DCMP=OTC-rss&nsref=online-news . horrible, reminds me of http://en.wikipedia.org/wiki/SIGSALY |
16:23:53 [bengee] | not sure, but could putting "STD", "cancer", and "abhorrent" "political views" together with "homosexual" in one sentence about crisis upset some people? |
16:24:40 [besbes] | besbes has joined #swig |
16:25:15 [_psychic_] | _psychic_ has joined #swig |
16:27:15 [myakura] | myakura has quit |
16:28:16 [mischat] | i don't think it is wrong, it's a matter of privacy, and in some cases something which need not be discussed in public. It should be noted that in some places in the world, homosexuality is still shunned, and people living in such places need to be protected too. |
16:28:28 [mischat] | s/need/needs/ |
16:29:51 [KjetilK] | yeah, there are places where people are executed for it |
16:31:27 [mischat] | I lived in a country where people are still hung for homosexuality: see http://news.bbc.co.uk/1/hi/world/middle_east/4725959.stm |
16:31:54 [mischat] | s/hung/hanged/ |
16:31:55 [bengee] | right, but I didn't mean the semantics, just the wording |
16:32:05 [mischat] | :) |
16:32:20 [mischat] | nosnooping++; |
16:32:23 [tobyink] | http://buzzword.org.uk/2009/stylish-rdfa/test.html |
16:32:25 [dc_swig] | H: http://buzzword.org.uk/2009/stylish-rdfa/test.html from tobyink |
16:32:31 [tobyink] | H:|Stylish RDFa |
16:32:33 [dc_swig] | Titled item H. |
16:32:37 [danbri] | it also puts extra expectations on social network owners to carefully test their UI for people whose native language isn't the one they're reading |
16:32:57 [danbri] | (I complained to Google/Orkut about this a few years ago) |
16:33:06 [tobyink] | H: Allows CSS selectors to select elements based on RDFa attribute values' *full* URIs (not CURIEs). |
16:33:08 [dc_swig] | Added comment H1. |
16:33:31 [tobyink] | H: e.g. *[uri-property~="http://purl.org/dc/terms/title"] { color: #090; background: #cfc; } |
16:33:33 [dc_swig] | Added comment H2. |
16:34:05 [FabGandon] | FabGandon has left #swig |
16:34:07 [mischat] | danbri: still on to do list to farsi the foaf ontology |
16:34:16 [danbri] | yes please :) |
16:36:32 [Wikier] | Wikier has quit |
16:37:41 [emrojo] | emrojo has joined #swig |
16:40:39 [pauld] | pauld has joined #swig |
16:41:15 [tobyink] | H: Essentially the inverse of [http://buzzword.org.uk/2008/rdf-ease/spec|RDF-EASE]. |
16:41:17 [dc_swig] | Added comment H3. |
16:42:24 [timbl] | timbl has quit |
16:43:59 [monkeyiq] | monkeyiq has joined #swig |
16:43:59 [mintsauce] | mintsauce has quit |
16:44:27 [mintsauce] | mintsauce has joined #swig |
16:47:18 [tobyink] | dajobe - this looks dead <http://chatlogs.planetrdf.com/swig/2009-03-11.html> |
16:49:43 [tobyink] | http://buzzword.org.uk/2009/logger-sioc-rdfa/swig/2009-03-11 |
16:49:45 [dc_swig] | I: http://buzzword.org.uk/2009/logger-sioc-rdfa/swig/2009-03-11 from tobyink |
16:49:51 [tobyink] | I:|Today's Chat Logs |
16:49:53 [dc_swig] | Titled item I. |
16:50:22 [tobyink] | I:[http://chatlogs.planetrdf.com/swig/2009-03-11.html|The usual ones] seem to have died in the wee small hours of the morning. |
16:50:23 [dc_swig] | Added comment I1. |
16:50:46 [tobyink] | I:Luckily I've been testing my own logger, so happen to have logs for today. |
16:50:48 [dc_swig] | Added comment I2. |
16:51:32 [oshani] | oshani has joined #swig |
16:53:44 [nwalsh] | nwalsh has quit |
16:55:37 [ephemerian] | ephemerian has quit |
16:57:01 [pauld] | pauld has quit |
16:59:20 [allisterb_] | allisterb_ has joined #swig |
17:00:22 [besbes] | besbes has quit |
17:00:36 [besbes] | besbes has joined #swig |
17:01:38 [FabGandon] | FabGandon has joined #swig |
17:03:04 [_psychic__] | _psychic__ has joined #swig |
17:03:05 [jsoltren] | jsoltren has joined #swig |
17:04:05 [KjetilK] | http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0037 |
17:04:06 [dc_swig] | J: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0037 from KjetilK |
17:04:23 [KjetilK] | J: Curl arbitrary file access vulnerability |
17:04:24 [dc_swig] | Added comment J1. |
17:04:27 [KjetilK] | J:| Curl arbitrary file access vulnerability |
17:04:28 [dc_swig] | Titled item J. |
17:04:53 [jaimico] | jaimico has joined #swig |
17:05:11 [KjetilK] | J1: Given curl's popularity among semwebbers, this one is worth a heads-up |
17:05:12 [dc_swig] | Replaced comment J1. |
17:05:52 [mintsauce] | mintsauce has quit |
17:06:04 [justben] | If I’m reading it right, it only applies with -L, yes? |
17:06:43 [KjetilK] | hmmm, I thought it is a compile-time option? |
17:07:51 [KjetilK] | it is probably most problematic if libcurl is used in an application |
17:10:44 [justben] | curl has their own advisory, including upgrade/patch instructions, at http://curl.haxx.se/docs/adv_20090303.html |
17:15:12 [besbes_] | besbes_ has joined #swig |
17:15:38 [allisterb] | allisterb has quit |
17:16:54 [_psychic_] | _psychic_ has quit |
17:17:32 [KjetilK] | justben: please add a comment to the chump! :-) |
17:19:05 [justben] | I haven’t been around the channel long; where can I read docs for that bot? |
17:19:29 [KjetilK] | actually, I don't know, but you can just say J: and your comment |
17:20:13 [iand] | iand has quit |
17:20:19 [KjetilK] | ah: http://usefulinc.com/chump/MANUAL.txt |
17:21:04 [KjetilK] | dc:help |
17:21:25 [KjetilK] | hmmm |
17:21:26 [_psychic__] | _psychic__ has quit |
17:21:48 [mischat] | dc_swig: help |
17:21:57 [FabGandon] | FabGandon has quit |
17:21:57 [dc_swig] | Post a URL by saying it on a line on its own |
17:21:57 [dc_swig] | To post an item without a URL, say BLURB:This is the title |
17:21:58 [dc_swig] | I will reply with a label, for example A |
17:21:58 [dc_swig] | You can then append comments by saying A:This is a comment |
17:21:58 [dc_swig] | To title a link, use a pipe as the first character of the comment |
17:21:58 [dc_swig] | Eg. A:|This is the title |
17:22:00 [dc_swig] | To see the last 5 links posted, say dc_swig:view |
17:22:02 [dc_swig] | For more features, say dc_swig:morehelp |
17:22:46 [duryodhan] | duryodhan has quit |
17:22:53 [justben] | J: Official advisory including upgrade/patch instructions at http://curl.haxx.se/docs/adv_20090303.html |
17:22:55 [dc_swig] | Added comment J2. |
17:24:18 [mischat] | if you put a pipe after J: you get a title |
17:25:40 [besbes] | besbes has quit |
17:25:47 [_psychic_] | _psychic_ has joined #swig |
17:27:02 [justben] | Huh. Given the channel I figured he’d be storing triples somewhere on the backend. |
17:32:03 [libby] | it does, justben, rss 1.0 http://swig.xmlhack.com/index.rss (retro!) |
17:33:30 [mischat] | or here : http://chatlogs.planetrdf.com/swig/2009-02-04.rdf |
17:35:25 [tobyink] | tobyink has quit |
17:35:27 [emrojo] | emrojo has quit |
17:36:27 [justben] | Sorry, I spoke imprecisely: I was hoping he’d be storing *arbitrary* triples on the backend somewhere. Unless I’m reading the source wrong (twisted! ack!), it’s just generating predefined ones. |
17:40:09 [bengee] | bengee has quit |
17:40:38 [pauld] | pauld has joined #swig |
17:41:34 [cygri] | cygri has quit |
17:43:55 [jaimico] | jaimico has quit |
17:50:07 [justben] | Who runs dc_swig, anyway? I don’t know when I’ll have time to hack in such a feature, but if I ever do it’ll be handy to know whom to contact :-) |
17:51:40 [Arnia] | Arnia has quit |
17:54:04 [BenO] | BenO has quit |
17:58:12 [evlist] | evlist has quit |
18:08:11 [jaimico] | jaimico has joined #swig |
18:11:26 [Freso] | Freso has joined #swig |
18:13:30 [lheuer] | lheuer has quit |
18:15:33 [minmax] | minmax has quit |
18:16:52 [KjetilK] | justben: usefulinc.com is edd's domain, so I guess he is a likely suspect |
18:18:49 [justben] | Thanks. His name’s also all over the changelog, so good chance he’s the one. |
18:20:55 [pauld] | pauld has quit |
18:22:37 [Arnia] | Arnia has joined #swig |
18:25:54 [jaimico] | jaimico has quit |
18:28:44 [evlist] | evlist has joined #swig |
18:30:25 [IvanHerman] | IvanHerman has quit |
18:30:42 [bijan] | bijan has quit |
18:33:34 [Pipian] | Pipian has joined #swig |
18:34:10 [mischat] | mischat has quit |
18:34:22 [Freso] | Freso has quit |
18:46:18 [Arnia_] | Arnia_ has joined #swig |
18:48:55 [Zach_Beauvais] | Zach_Beauvais has joined #swig |
18:54:57 [swh] | swh has quit |
19:01:21 [Arnia] | Arnia has quit |
19:01:32 [dulanov] | dulanov has joined #swig |
19:08:16 [beobal] | beobal has quit |
19:08:17 [besbes_] | besbes_ has quit |
19:09:33 [reto] | reto has joined #swig |
19:12:04 [jansc] | jansc has joined #swig |
19:12:09 [danbri] | http://www.sunlightlabs.com/appsforamerica/ |
19:12:10 [dc_swig] | K: http://www.sunlightlabs.com/appsforamerica/ from danbri |
19:12:33 [danbri] | K:|Sunlight Labs - development contest |
19:12:35 [dc_swig] | Titled item K. |
19:45:37 [dajobe] | crap, logger gone |
19:46:02 [twanj] | twanj has joined #swig |
19:46:05 [logger] | logger has joined #swig |
19:46:05 [logger] | * logger is logging |
19:46:58 [dajobe] | justben: I run dc_swig too |
19:49:50 [mhausenblas] | dajobe: you run planetrdf, right? |
19:49:54 [dajobe] | yes |
19:50:02 [justben] | Thanks! |
19:50:04 [dajobe] | I am ur swig support guy ;) |
19:50:14 [mhausenblas] | I sent a mail re adding a blog a bit ... |
19:50:21 [mhausenblas] | no reaction so far :( |
19:50:24 [dajobe] | I vaguely remember that |
19:50:31 [mhausenblas] | ah |
19:50:37 [mhausenblas] | which means? :) |
19:51:01 [dajobe] | looks relevant |
19:51:05 [dajobe] | but http://webofdata.wordpress.com/feed/ is not rss 1.0 |
19:51:09 [mhausenblas] | ah |
19:51:13 [mhausenblas] | lemme check |
19:51:15 [dajobe] | http://webofdata.wordpress.com/feed/rdf/ |
19:51:16 [dajobe] | is |
19:51:34 [dajobe] | i shall add it now |
19:51:45 [mhausenblas] | thanks! |
19:54:15 [dajobe] | it should be pulled in within 20mins |
19:54:52 [mhausenblas] | great! thanks! |
19:56:06 [Arnia_] | Arnia_ is now known as Arnia |
20:00:59 [mischat] | mischat has joined #swig |
20:03:36 [Paul_Miller] | Paul_Miller has quit |
20:11:13 [ldodds] | ldodds has joined #swig |
20:14:35 [kasei] | cygri: around? |
20:14:54 [kasei] | ah, not even logged in... |
20:16:16 [kasei] | phenny, tell cygri would you consider easing the restriction on prefix.cc to allow non-URLs? I want to add the jena library java: URI. |
20:16:17 [phenny] | kasei: I'll pass that on when cygri is around. |
20:20:35 [ldodds] | ldodds has left #swig |
20:26:58 [KjetilK] | kasei: I was thinking about partitioning of data across hosts, and I was wondering if that's something hexastore could easily do? |
20:33:49 [kasei] | umm... in what way? partitioning the triples? |
20:34:40 [kasei] | heya, btw :) meant to say hello on the dawg chat yesterday... |
20:35:51 [twanj] | twanj has quit |
20:36:21 [twanj] | twanj has joined #swig |
20:37:20 [kasei] | i suspect you could get clever in designing a way of partitioning the hexastore indexes, but I think you'd still have to have the top of the index trees on a central computer. |
20:42:42 [mhausenblas] | good night, Web of Data |
20:43:24 [kasei] | night mhausenblas |
20:44:19 [mhausenblas] | mhausenblas has quit |
20:45:14 [besbes] | besbes has joined #swig |
20:46:18 [KjetilK] | hei kasei, indeed! :-) |
20:46:55 [besbes] | besbes has quit |
20:47:12 [KjetilK] | actually, I don't care so much in what way they can be partitioned, as long as it would be possible to scale horizontally to a certain extent |
20:48:06 [kasei] | well, the trivial thing would be to have actual partitions, and just have a hexastore per node. |
20:48:27 [kasei] | in that case, you could use any existing or new partitioning scheme, just using hexastore as the node-level triplestore. |
20:48:39 [KjetilK] | I noticed a little while ago that the cost of RAM nowadays is NOK 70/GB, around €10 |
20:48:53 [KjetilK] | hmmmm, interesting |
20:48:55 [kasei] | all the ways I'm imagining trying to build distribution into the actual hexastore code are much more complicated. |
20:49:53 [KjetilK] | right :-) |
20:51:10 [KjetilK] | I've been thinking in terms of keeping the model in RAM in a High Availability solution, and just serialise the model now and then for backup from a node at a time when there is low traffic |
20:51:32 [kasei] | the only interesting thing about hexastore is that you've got all 6 indexes. aside from that, what you're asking is how to distribute a rather standard B+ tree index over triples. |
20:52:01 [kasei] | might be existing literature here, but i'm not familiar with it. |
20:53:13 [aklassen] | aklassen has quit |
20:54:13 [KjetilK] | * KjetilK has no idea what he's actually asking :-) |
20:54:40 [kasei] | heh |
20:54:59 [chimezie] | chimezie has quit |
20:55:16 [eikeon] | eikeon has joined #swig |
20:57:36 [MacTed] | MacTed has quit |
21:02:46 [KjetilK] | but, well, cheap 64 GB motherboard: http://www.tyan.com/product_board_detail.aspx?pid=445 stuff that with two quad-core Xeons and cheap DDR2 RAM, I would think that you'd have a pretty solid database server for under NOK 30000, i.e. USD 4500...? |
21:03:29 [jaresty] | jaresty has quit |
21:03:35 [harbulot] | harbulot has quit |
21:05:09 [Jerub] | this guy is asking "what is rdf, and how can I reimplement my code to leverage it" but doesn't know it yet: http://stackoverflow.com/questions/635483/what-is-the-best-way-to-implement-nested-dictionaries-in-python |
21:05:52 [mun] | mun has quit |
21:06:19 [mun] | mun has joined #swig |
21:11:50 [kasei] | Jerub: RDF, or SQL, or XSL, or ... |
21:15:36 [KjetilK] | kasei: is there a lot of denormalisation in hexastore? |
21:17:29 [KjetilK] | just as an order-of-magnitude estimate, if the average URI is 30 bytes and the average literal is 150 bytes, how many could I fit in 64 GB RAM? |
21:19:14 [kasei] | unclear |
21:19:30 [kasei] | are you asking about the official hexastore? or mine? or some ideal implementation? :) |
21:19:41 [leobard] | leobard has quit |
21:19:51 [KjetilK] | yours, that's the interesting one :-) |
21:19:55 [jansc] | jansc has quit |
21:19:59 [kasei] | all of this is acting on integer ids for the nodes, so the size of the URIs and literals doesn't matter -- they're stored once. |
21:20:21 [KjetilK] | ah, ok |
21:20:45 [kasei] | I haven't spent a lot of time looking into storage size just yet... the storage overhead is huge for small datasets, and expected to shrink (proportionately) on large datasets |
21:21:25 [kasei] | trouble is that the overhead is inversely proportional to the loading time... as described in the hx paper, loading time is unbearably slow. |
21:21:32 [KjetilK] | so, the idea of having a large dataset in RAM because it is cheap is actually a good one? |
21:21:35 [kasei] | (but overhead is very small) |
21:22:04 [KjetilK] | hmmmm, so how about random inserts? |
21:22:13 [karlcow] | karlcow has quit |
21:22:37 [kasei] | this is the issue. random inserts are reasonably fast on mine (distinct from the description in the paper) |
21:22:49 [KjetilK] | ok, nice |
21:22:50 [kasei] | the speed comes at the cost of the aforementioned space overhead. |
21:23:08 [KjetilK] | ah, ok |
21:23:38 [kasei] | i'd like to look around at existing in-memory stores and see how much space they use/how many triples they can load, but I'm afraid I don't have the spare time for it... |
21:24:09 [KjetilK] | so, the typical SPARQL Update query is ok, but recovery from backup would be slow? |
21:24:25 [kasei] | yeah |
21:24:41 [kasei] | that's probably true of most triplestores, though. |
21:24:44 [KjetilK] | ok, but that doesn't sound too bad :-) |
21:24:48 [KjetilK] | yup |
21:25:30 [kasei] | i spent a lot of time on the code the past couple of weeks, and now am taking something of a break from it... |
21:25:48 [kasei] | got bogged down in implementing full graph pattern matching, and burned myself out for a bit. |
21:25:54 [KjetilK] | ok, is there anything on CPAN? |
21:26:14 [kasei] | it's all in git... remember, though, that this doesn't have any perl bindings yet. |
21:26:26 [KjetilK] | yeah, dangerous stuff it is |
21:26:28 [KjetilK] | cool |
21:26:34 [jansc] | jansc has joined #swig |
21:26:38 [KjetilK] | oh, you have started the C version of it? |
21:27:06 [kasei] | yeah, that's pretty much all i've been working on recently. the perl version was more of a proof of concept. |
21:27:15 [KjetilK] | yup |
21:27:57 [kasei] | somewhat confusingly, though, if you get the source from github, the C version is in a src/ directory inside the perl impl. directory... |
21:35:02 [evlist] | evlist has quit |
21:37:10 [mischat] | mischat has quit |
21:44:46 [mischat] | mischat has joined #swig |
21:46:37 [nathany] | nathany has quit |
21:48:16 [mischat_] | mischat_ has joined #swig |
21:50:09 [jaresty] | jaresty has joined #swig |
21:50:47 [dulanov] | dulanov has quit |
21:53:59 [jansc] | jansc has quit |
21:54:46 [mischat] | mischat has quit |
21:57:56 [Freso] | Freso has joined #swig |
22:04:07 [chimezie] | chimezie has joined #swig |
22:13:06 [evlist] | evlist has joined #swig |
22:24:59 [pauld] | pauld has joined #swig |
22:31:01 [pauld_] | pauld_ has joined #swig |
22:36:12 [minmax] | minmax has joined #swig |
22:39:43 [pauld_] | pauld_ has quit |
22:40:02 [Zach_Beauvais] | Zach_Beauvais has quit |
22:40:55 [hg] | hg has joined #swig |
22:49:19 [pauld_] | pauld_ has joined #swig |
22:50:41 [Shepard] | in a knowledge base, do rules belong to the ABox, the TBox or neither? |
22:52:32 [pauld] | pauld has quit |
22:53:41 [CloCkWeRX] | CloCkWeRX has joined #swig |
22:54:43 [karlcow] | karlcow has joined #swig |
22:56:07 [reto] | reto has quit |
22:56:48 [pauld_] | pauld_ has quit |
22:57:41 [CloCkWeRX] | CloCkWeRX has left #swig |
22:58:27 [besbes] | besbes has joined #swig |
22:58:58 [chimezie] | Shepard:TBox |
23:00:13 [justben] | justben has quit |
23:00:35 [jsoltren] | jsoltren has quit |
23:00:39 [besbes] | besbes has quit |
23:01:47 [MacTed] | MacTed has joined #swig |
23:03:21 [minmax_] | minmax_ has joined #swig |
23:04:13 [Freso] | Freso has quit |
23:04:39 [minmax] | minmax has quit |
23:08:22 [Shepard] | chimezie: thanks |
23:14:39 [chimezie] | np |
23:22:21 [mischat_] | mischat_ has quit |
23:23:26 [pauld] | pauld has joined #swig |
23:24:49 [_psychic_] | _psychic_ has quit |
23:28:27 [aklassen] | aklassen has joined #swig |
23:29:08 [pauld] | pauld has quit |
23:35:41 [pauld] | pauld has joined #swig |
23:43:52 [pauld] | pauld has quit |
23:44:43 [minmax_] | minmax_ has quit |
23:46:57 [minmax_] | minmax_ has joined #swig |
23:56:26 [melvster] | melvster has quit |
23:57:52 [harbulot] | harbulot has joined #swig |
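
Note on the dbpedia URI discussion (12:33:59 to 13:28:43): tobyink's "cheat" of stripping http://dbpedia.org/resource/ from the front of a URI, and the page/resource distinction, can be sketched roughly as below. This is only an illustration, not anything dbpedia ships. The function names are hypothetical, and it assumes that underscores in the local name stand for spaces and that other characters may be percent-encoded.

```python
from urllib.parse import unquote

# The two prefixes named in the log: the thing itself vs. the page about it.
RESOURCE_PREFIX = "http://dbpedia.org/resource/"
PAGE_PREFIX = "http://dbpedia.org/page/"


def name_from_resource_uri(uri: str) -> str:
    """Strip the resource prefix and return a readable name (hypothetical helper)."""
    if not uri.startswith(RESOURCE_PREFIX):
        raise ValueError(f"not a dbpedia resource URI: {uri}")
    local_name = uri[len(RESOURCE_PREFIX):]
    # Assumption: underscores stand for spaces; decode any percent-escapes first.
    return unquote(local_name).replace("_", " ")


def page_uri_for(resource_uri: str) -> str:
    """Map the URI for a thing to the URI of dbpedia's page about it (hypothetical helper)."""
    if not resource_uri.startswith(RESOURCE_PREFIX):
        raise ValueError(f"not a dbpedia resource URI: {resource_uri}")
    return PAGE_PREFIX + resource_uri[len(RESOURCE_PREFIX):]


if __name__ == "__main__":
    uri = "http://dbpedia.org/resource/Hillcrest_Lutheran_Academy"
    print(name_from_resource_uri(uri))
    print(page_uri_for(uri))
```

As tobyink points out at 12:32:07, this shortcut does not follow redirects, so a name obtained this way is only as good as the URI the infobox happened to link to.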