[TRU Research] Web App Data Schema

Jim Walseth jim.walseth at gmail.com
Sun Sep 8 13:38:05 PDT 2019


Harry, thanks for the suggestions about the filled leaf and the icon
source; both would make things easier.

Katie, I will try to "express" some of the other attributes you mention.

Regarding embedding, *attached* is the 'embedding logic' for DRAFT 3.



On Sun, Sep 8, 2019 at 1:22 PM Katie Wilson <katie at transitriders.org> wrote:

> Hi all,
>
> Sorry to be slow on this stuff, I’m very behind…
>
> I kind of feel like at this point we might not need the leaves, since that
> information is expressed already by both the length and color of the bars.
> But if you want to play around with leaves and see what it looks like, go
> for it.
>
> Either way, I do feel like we could/should have badges for the other
> notable characteristics, e.g.:
>
>
>
>    - *Tortoise:* No pre-tax option
>    - *Uneven scales:* Employer that offers much better transit benefits
>    to some workers than others
>    - *Car:* Subsidizes parking but not transit, or parking more than
>    transit
>    - *Gold ribbon:* Industry leader or small business that goes above and
>    beyond
>
>
> Harry, re descriptions— I'm planning to go through and strip those down to
> just the info about what benefits are actually provided. I don’t think we
> want quotes in there, let alone quotes that identify individuals!
>
> I’m hoping to have some time a little later this afternoon to add a column
> to the spreadsheet for monthly cost (right now the bar length is based on
> the ranking and other columns, so definitely not 100% accurate) and clean
> up the entries that are ready to go.
>
> On another note, Jim, how will I be able to go about embedding this in our
> campaign website?? It’s a squarespace site. I can probably share a draft of
> that today or tomorrow, too.
>
> - Katie
>
> On Sep 8, 2019, at 1:02 PM, Harry Maher <harryb.maher at gmail.com> wrote:
>
> Hi Jim,
>
> Cool! Badges would make it pretty if it's possible; I think leaves would
> be cool. Maybe 0 to 4? That might require someone with more graphic design
> experience, but 5 diff images with various levels of filled-in leaves would
> be awesome and they'd probably need to get hard-coded into whatever
> database tableau requires.
>
> This may be helpful:
> https://icons8.com/icons/set/leaf
> (Here are 2 images of leaves to potentially use as empty and filled-in
> leaves, but they have other leaf options and it's easy to fill and stroke
> with diff colors and etc.)
> <icons8-leaf-60.png><icons8-leaf-60(1).png>
>
> Quick note--before making this public on our website, we should scrub some
> details from the Robert Half quote. My friend's quote identifies him as *the*
> accountant for RH currently contracted at UWP, which could potentially
> put him in an awkward position. *Can we just get rid of that second
> sentence* that says he's getting a card through UWP? It identifies him to
> Robert Half and it's a confusing and unhelpful quote out of context anyway.
>
> Thanks!
> -Harry
>
> On Sun, Sep 8, 2019 at 12:24 PM Jim Walseth <jim.walseth at gmail.com> wrote:
>
>> Hi all,
>>
>> Is a visualization with badges still a goal (e.g., leaves)? I was just
>> sitting down to try and figure that out.
>>
>> Certainly would be nice to have something "jazzier" than the current
>> draft:
>>
>> https://public.tableau.com/profile/jwalseth#!/vizhome/CosttoWorkDRAFT3/CosttoWorkbyNeighborhoodandIndustry
>>
>> Jim
>>
>> On Thu, Sep 5, 2019 at 12:11 PM Tom Chartrand <tmchartrand at gmail.com>
>> wrote:
>>
>>> I'm also past a busy time at work now and could put in some time on this
>>> over the weekend. Katie, let me know if there's some chunk it would be easy
>>> to delegate, like getting some additional set into the spreadsheet.
>>> -Tom
>>>
>>> On Thu, Sep 5, 2019 at 12:07 PM Katie Wilson <katie at transitriders.org>
>>> wrote:
>>>
>>>> I think we’re in pretty good shape— Jim has been mocking up some
>>>> visuals and we’ll want to add a numerical “monthly cost” column (instead of
>>>> or in addition to the rating) but I think I can do that.
>>>>
>>>> Thanks for all your work on this, and hope you’re doing ok Stephen!
>>>>
>>>> On Sep 5, 2019, at 11:42 AM, Stephen DeSanto <rachidian at gmail.com>
>>>> wrote:
>>>>
>>>> Hi friends, happy Thursday. I am dealing with some personal issues
>>>> that are preventing me from dedicating a lot of time and energy to this
>>>> project right now. I hope you all can get this over the finish line.
>>>>
>>>> In solidarity,
>>>>
>>>> Stephen
>>>>
>>>>
>>>> On Tue, Aug 27, 2019 at 4:23 PM Katie Wilson <katie at transitriders.org>
>>>> wrote:
>>>>
>>>>> Great! I will head over there at 5:30. I’ll grab the conference room
>>>>> if it’s free, otherwise I’ll just find us a table.
>>>>>
>>>>> On Aug 27, 2019, at 12:18 PM, Jim Walseth <jim.walseth at gmail.com>
>>>>> wrote:
>>>>>
>>>>> I should be able to attend as well. Cheers, -Jim
>>>>>
>>>>> On Mon, Aug 26, 2019 at 3:32 PM Stephen DeSanto <rachidian at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Victrola is fine! 5:30 works for me. Gives me time to get there after
>>>>>> work.
>>>>>>
>>>>>> On Mon, Aug 26, 2019 at 3:09 PM Katie Wilson <katie at transitriders.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Cool, does Victrola on 15th work again, or is somewhere else better?
>>>>>>> Having an actual meeting, even if a small one, will help me to sit down and
>>>>>>> focus on this instead of getting distracted by other things. 5pm good, or
>>>>>>> 5:30 or 6?
>>>>>>>
>>>>>>> On Aug 26, 2019, at 9:38 AM, Stephen DeSanto <rachidian at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I'm also available Tues evening. Otherwise I'll keep making updates
>>>>>>> from home. :)
>>>>>>>
>>>>>>> On Sun, Aug 25, 2019 at 4:43 PM Katie Wilson <
>>>>>>> katie at transitriders.org> wrote:
>>>>>>>
>>>>>>>> This is awesome, thank you Stephen. I put some thoughts in-line in
>>>>>>>> red below, and attached a hard-to-interpret spreadsheet with info
>>>>>>>> about Business Choice participants.
>>>>>>>>
>>>>>>>> I will try to schedule some time this week to start completing rows
>>>>>>>> for the businesses I’m sure about based on the info we have. If anyone
>>>>>>>> wants to get together for a spreadsheet workparty this week let me know, I
>>>>>>>> have time Tuesday and Friday evenings.
>>>>>>>>
>>>>>>>> Katie
>>>>>>>>
>>>>>>>> On Aug 25, 2019, at 3:37 PM, Stephen DeSanto <rachidian at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi everyone. Made a few updates to the master list of employers
>>>>>>>> <https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing>
>>>>>>>> for the upcoming website:
>>>>>>>>
>>>>>>>>    - Added companies and transit benefits raw descriptions from
>>>>>>>>    TRU survey data
>>>>>>>>    - Added "Likely CTR Targets":
>>>>>>>>    https://seattletransitpasses-research.pbworks.com/w/page/133438365/Likely%20Target%20Assessment
>>>>>>>>    - Added "Potential CTR Targets":
>>>>>>>>    https://seattletransitpasses-research.pbworks.com/w/page/133437828/Potential%20CTR%20Targets
>>>>>>>>    - "Likely" / high-profile targets (hotels, banks) are
>>>>>>>>    highlighted in ORANGE
>>>>>>>>    - Added "Potential Poster Children":
>>>>>>>>    https://seattletransitpasses-research.pbworks.com/w/page/133439169/Potential%20Poster%20Children
>>>>>>>>    - Poster children are highlighted in GREEN
>>>>>>>>
>>>>>>>> Highlight colors are just to make it easier to find rows that a)
>>>>>>>> someone said should be included in the list, and b) probably need benefits
>>>>>>>> data incorporated.
>>>>>>>>
>>>>>>>> Things to be done:
>>>>>>>>
>>>>>>>>    - Normalize locations data
>>>>>>>>    [Katie: For this and the “leaf scores” and the “polluter” columns,
>>>>>>>>    I’d be inclined to do this after we’ve got enough info to check the
>>>>>>>>    “publish” box. I could be wrong, but I feel like it will be less work
>>>>>>>>    that way.]
>>>>>>>>    - Assign "leaf scores" to all companies that don't have one
>>>>>>>>    - Assign "polluter" etc badges to companies we want to
>>>>>>>>    name&shame
>>>>>>>>    - Add all of the hotels?
>>>>>>>>    https://seattletransitpasses-research.pbworks.com/w/page/133666440/Hotels
>>>>>>>>    [Katie: Yeah, let’s go ahead and add them...]
>>>>>>>>    - Add Choice participants?
>>>>>>>>    [Katie: Good question. I did get info back from Metro on what products
>>>>>>>>    the Choice participants are buying; I can’t remember whether I shared
>>>>>>>>    that with you all. Anyway, it’s attached. It’s actually a little hard
>>>>>>>>    to interpret (I got a tutorial from a Metro staffer), so I can try to
>>>>>>>>    explain by phone or in person if someone wants to dig through that.
>>>>>>>>    Maybe it makes sense to look through that info and add businesses
>>>>>>>>    selectively as we feel like we have a grasp on their programs.]
>>>>>>>>    - Add column for Commute Seattle participants?
>>>>>>>>    https://seattletransitpasses-research.pbworks.com/w/page/133438167/Commute%20Seattle%20List%20of%20Passport%20Participants
>>>>>>>>    [Katie: I don’t think we need to do this, because this info is most
>>>>>>>>    likely duplicative of what we learned from Metro about Passport
>>>>>>>>    participants. The Commute Seattle list doesn’t tell us how much of a
>>>>>>>>    subsidy they provide, so it’s not going to add much.]
>>>>>>>>
>>>>>>>>
>>>>>>>>    - Citations, descriptions of benefits, etc. for companies that
>>>>>>>>    need it
>>>>>>>>
>>>>>>>> There's still a lot we're not 100% sure about for employer
>>>>>>>> benefits, but we can do our best with what we've got, and make
>>>>>>>> changes as we get new information.
>>>>>>>>
>>>>>>>> My thinking was, we can use the "master employer list" to gather as
>>>>>>>> much information as we can about the companies we're interested in. When
>>>>>>>> we're satisfied that a row is finished and ready for publication, we check
>>>>>>>> the checkbox in the "__publish" column. Then, when we export this data to
>>>>>>>> the website, we pull only the rows where "__publish" is checked. This
>>>>>>>> will hopefully ensure that someone has manually reviewed and verified all
>>>>>>>> the data for an employer before it gets published, and that unfinished
>>>>>>>> rows won't be accidentally exported.
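
A minimal sketch of the export filter described above, assuming the sheet is downloaded as CSV and that a checked checkbox arrives as the text TRUE (an assumption about the export format, not something confirmed in this thread). The function name `publishable_rows` and the sample rows are hypothetical. It also drops every column whose name begins with two underscores, since the thread treats those as not public.

```python
import csv
import io

PRIVATE_PREFIX = "__"          # columns the thread says should not be public
PUBLISH_COLUMN = "__publish"   # checkbox column named in the thread


def publishable_rows(csv_text):
    """Return rows marked for publication, with private columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Assumption: a checked box is exported as the string TRUE.
        if (row.get(PUBLISH_COLUMN) or "").strip().upper() != "TRUE":
            continue
        rows.append(
            {k: v for k, v in row.items() if k and not k.startswith(PRIVATE_PREFIX)}
        )
    return rows


# Hypothetical sample data, for illustration only.
sample = """employer,rating,__publish,__source_a
Seattle Coffee Works,4,TRUE,self-reported
Example Employer,2,FALSE,unverified
"""

print(publishable_rows(sample))
```

Filtering at export time keeps the review step in the spreadsheet itself: an unchecked row never reaches the website, even if its other columns are filled in.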
>>>>>>>>
>>>>>>>> Is this helpful? Am I just spinning my wheels in the mud?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 22, 2019 at 7:06 PM Stephen DeSanto <
>>>>>>>> rachidian at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> FYI, added most of you as editors on the spreadsheet I'm working
>>>>>>>>> on, in case anyone has time for tedious data tasks (or a quick way to do
>>>>>>>>> tedious data tasks). I'm currently adding in data from the TRU survey, from
>>>>>>>>> respondents whose employers offer transit benefits. Eventually, we'll need
>>>>>>>>> to tag these with industry and fix the neighborhoods data? And add in any
>>>>>>>>> other company data we have from the other research spreadsheets on the
>>>>>>>>> wiki? And eventually some subset of this data ends up on the website?
>>>>>>>>>
>>>>>>>>> On Tue, Aug 20, 2019 at 3:03 PM Tom Chartrand <
>>>>>>>>> tmchartrand at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Oh you're right, sorry for the confusion everyone! It was just
>>>>>>>>>> fairly hidden in the view I looked at. Column S!
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 20, 2019 at 3:00 PM Katie Wilson <
>>>>>>>>>> katie at transitriders.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think the spreadsheet with PII removed still does include the
>>>>>>>>>>> Employer column, no?
>>>>>>>>>>>
>>>>>>>>>>> Sorry I’m being slow to respond to all this good stuff, I am
>>>>>>>>>>> still digging myself out from being away last week and I’m at an all-day
>>>>>>>>>>> thing today… but I should have time to pay more attention before the end of
>>>>>>>>>>> the week!
>>>>>>>>>>>
>>>>>>>>>>> On Aug 19, 2019, at 6:26 PM, Stephen DeSanto <
>>>>>>>>>>> rachidian at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I have time to go through the survey data and find the reported
>>>>>>>>>>> transit benefits per employer, though I'll need the data set that contains
>>>>>>>>>>> that data. :)
>>>>>>>>>>>
>>>>>>>>>>> Otherwise, I am going to be trying to match CTR neighborhoods to
>>>>>>>>>>> the employers already in our spreadsheet, as well as adding any employers
>>>>>>>>>>> mentioned in other sources/sheets on our wiki.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Aug 19, 2019 at 6:02 PM Tom Chartrand <
>>>>>>>>>>> tmchartrand at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> This is looking great, Stephen!
>>>>>>>>>>>> I had put myself down to organize the survey data with respect
>>>>>>>>>>>> to employers for this, but I just realized that info was removed as PII (of
>>>>>>>>>>>> course)! So either Mike will need to take that on (I think Mike did the
>>>>>>>>>>>> original PII removal) or we'll need to figure out an appropriate way of
>>>>>>>>>>>> sharing that.
>>>>>>>>>>>> I'm feeling pretty swamped myself lately, so if you (Stephen)
>>>>>>>>>>>> were down to help him with the task that could be great. I can certainly
>>>>>>>>>>>> still take on some of it if needed though, once we get this sorted out.
>>>>>>>>>>>> Katie, maybe you could help coordinate this and make sure Mike
>>>>>>>>>>>> sees this sooner rather than later?
>>>>>>>>>>>>
>>>>>>>>>>>> Also, do let me know if you have any more specific spots in the
>>>>>>>>>>>> report where some backup from the PSRC dataset could be useful!
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Aug 18, 2019 at 3:59 PM Stephen DeSanto <
>>>>>>>>>>>> rachidian at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I've added the list of industry categories to the Google
>>>>>>>>>>>>> Sheet, so that should help validate the data we add there, though it's
>>>>>>>>>>>>> likely going to be a manual task to fill in industries for all the
>>>>>>>>>>>>> employers.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've also added a "citation" column, which can be the public
>>>>>>>>>>>>> representation of where we got the data to make our claim. We can fuss with
>>>>>>>>>>>>> the wording later.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I should have time this week to go through our survey data and
>>>>>>>>>>>>> other wiki tables to add or modify employers in the Google Sheet. Agree
>>>>>>>>>>>>> that it'll be good to have solid information on our primary targets and
>>>>>>>>>>>>> champions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 13, 2019 at 10:48 PM Harry Maher <
>>>>>>>>>>>>> harryb.maher at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just a quick update with regard to qualitative data analysis:
>>>>>>>>>>>>>> I made a "Commute Survey Qualitative Data Analysis" folder on pbworks and
>>>>>>>>>>>>>> put a doc with some quotes in it for the report. I tried to pull out the
>>>>>>>>>>>>>> main relevant themes that I noticed discussed in the two qualitative
>>>>>>>>>>>>>> questions currently in the file with a couple of quote options for each
>>>>>>>>>>>>>> theme/category of response to the question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Harry
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Aug 11, 2019 at 12:54 PM Tom Chartrand <
>>>>>>>>>>>>>> tmchartrand at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regarding where to have this discussion - I'm just gonna
>>>>>>>>>>>>>>> continue the email chain cause I haven't followed where to put the
>>>>>>>>>>>>>>> discussion on the wiki, but someone feel free to steer it over there if we
>>>>>>>>>>>>>>> want to!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A brief update regarding establishing a larger list of
>>>>>>>>>>>>>>> employers to include in the dataset: basic contact information for all
>>>>>>>>>>>>>>> seattle businesses, sorted by the North American Industry Classification
>>>>>>>>>>>>>>> System, is available at
>>>>>>>>>>>>>>> https://web6.seattle.gov/fas/slimbizsearch/ResultsPage.aspx?NAICList=Top100,
>>>>>>>>>>>>>>> but it's a huge list of course, with no info on number of employees or
>>>>>>>>>>>>>>> revenue to filter out the smaller ones. Still, I did send off an email
>>>>>>>>>>>>>>> about getting a copy of the database just for purposes of cross-referencing
>>>>>>>>>>>>>>> names and such.
>>>>>>>>>>>>>>> On 8/10/19 6:42 PM, Katie Wilson wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For “neighborhood” I think it makes sense to use the “CTR
>>>>>>>>>>>>>>> Network Areas” as defined here
>>>>>>>>>>>>>>> <https://www.seattle.gov/transportation/projects-and-programs/programs/transportation-options-program/commute-trip-reduction-program/draft-2019-2023-networks-and-targets>
>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For “industry” I think it makes sense to use the “Employment
>>>>>>>>>>>>>>> Sector” categories listed on Page 12 of this CTR strategic plan.
>>>>>>>>>>>>>>> <https://www.seattle.gov/Documents/Departments/SDOT/TransportationOptionsProgram/CTR_Draft_Strategic_Plan_Jan2019.pdf>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On the ratings, I think it does make sense to lump "piggy
>>>>>>>>>>>>>>> bank" and "brown tortoise" in the same rating (0), and then add a tortoise
>>>>>>>>>>>>>>> badge for employers that aren’t even doing the pre-tax thing.
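
A minimal sketch of the lumping proposed above, assuming the draft scale runs from -1 to 4 with "brown tortoise" at -1 and "piggy bank" at 0 (the thread gives the range, but the exact position of each label is an assumption). The function name `normalize_rating` and the badge string "tortoise" are hypothetical.

```python
def normalize_rating(raw_rating):
    """Collapse the draft -1..4 scale to 0..4 and return (rating, badges)."""
    if raw_rating not in range(-1, 5):
        raise ValueError(f"rating {raw_rating} is outside the draft -1..4 scale")
    if raw_rating == -1:
        # Assumption: -1 means no pre-tax option, so it becomes rating 0
        # plus a tortoise badge.
        return 0, ["tortoise"]
    return raw_rating, []


print(normalize_rating(-1))
print(normalize_rating(3))
```

Keeping the tortoise as a badge rather than a rating means the bar or leaf display only has to handle 0 through 4, while the "no pre-tax option" signal is still recorded.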
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another simplification option to consider would be to lump
>>>>>>>>>>>>>>> together 3 and 4 leaves. But let’s leave them separate for now and
>>>>>>>>>>>>>>> depending on how things shake out we can easily combine them later.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We don’t have any major sources of data on what benefits
>>>>>>>>>>>>>>> employers provide other than:
>>>>>>>>>>>>>>> — Metro public disclosure request spreadsheet
>>>>>>>>>>>>>>> <https://seattletransitpasses-research.pbworks.com/w/page/133438080/First%20Public%20Records%20Request>
>>>>>>>>>>>>>>> — Our commute survey
>>>>>>>>>>>>>>> — Info gleaned online from company websites, asking around,
>>>>>>>>>>>>>>> glassdoor etc (what I’ve found I’ve added to the relevant
>>>>>>>>>>>>>>> tables in the wiki
>>>>>>>>>>>>>>> <https://seattletransitpasses-research.pbworks.com/w/page/132177123/Employers>,
>>>>>>>>>>>>>>> on CTR employers and “potential poster children” and “likely target
>>>>>>>>>>>>>>> assessment” and “hotels”)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe it makes sense to have another string indicating
>>>>>>>>>>>>>>> sufficient certainty — when we have two sources, or one very reliable
>>>>>>>>>>>>>>> source, we enter an X or whatever, and that gives us the green light to
>>>>>>>>>>>>>>> display that data. Also it may not make sense to put a lot of work into
>>>>>>>>>>>>>>> categorizing employers into Network Area and Employment Sector until we
>>>>>>>>>>>>>>> have reliable data on what benefits they’re offering.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Speaking of Seattle Coffee Works, I spoke with their HR
>>>>>>>>>>>>>>> person a few months ago and actually employees have to pay $20/month
>>>>>>>>>>>>>>> (pre-tax $) if they want an ORCA card. Still a great deal but not 100%
>>>>>>>>>>>>>>> subsidy as reported in the Metro data— which, I then learned, is
>>>>>>>>>>>>>>> self-reported by the company. Metro only knows that all those companies are
>>>>>>>>>>>>>>> signed up for the Passport program. I noted the real situation on
>>>>>>>>>>>>>>> this page
>>>>>>>>>>>>>>> <https://seattletransitpasses-research.pbworks.com/w/page/133439169/Potential%20Poster%20Children>.
>>>>>>>>>>>>>>> Anyway, the point is we should probably crosscheck the Metro data as much
>>>>>>>>>>>>>>> as we can with our survey or other sources of information.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (Also speaking of Seattle Coffee Works they have locations
>>>>>>>>>>>>>>> in Capitol Hill & Cascade too
>>>>>>>>>>>>>>> <https://www.seattlecoffeeworks.com/our-cafes.aspx>. From
>>>>>>>>>>>>>>> talking with the HR person I’m pretty sure all are included in their
>>>>>>>>>>>>>>> passport program, and the employees swap around a lot from location to
>>>>>>>>>>>>>>> location. They probably use the Ballard location as home base for transit
>>>>>>>>>>>>>>> pass purposes since that’s the least expensive zone.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One project would be to come up with a list of employers
>>>>>>>>>>>>>>> that have name recognition (or that we are interested in for some other
>>>>>>>>>>>>>>> reason) and put a little work into attaining sufficient certainty. If we
>>>>>>>>>>>>>>> posted the list to a page and put a call out on social media and email I
>>>>>>>>>>>>>>> bet we’d get some answers.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Aug 8, 2019, at 5:26 PM, Stephen DeSanto <
>>>>>>>>>>>>>>> rachidian at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've taken a first pass at the data schema for showing
>>>>>>>>>>>>>>> employer transit benefits in our upcoming web app. In this draft, each
>>>>>>>>>>>>>>> employer record is represented as follows:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>     "employer": string,
>>>>>>>>>>>>>>>     "industry": [string],
>>>>>>>>>>>>>>>     "neighborhood": [string],
>>>>>>>>>>>>>>>     "alias": [string],
>>>>>>>>>>>>>>>     "rating": int,
>>>>>>>>>>>>>>>     "description": string,
>>>>>>>>>>>>>>>     "badges": [string]
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Employer* is a plain text string.
>>>>>>>>>>>>>>> *Industry* is a list of strings (or a single string, if we
>>>>>>>>>>>>>>> want to limit one employer = one industry).
>>>>>>>>>>>>>>> *Neighborhood* is treated similarly to industry.
>>>>>>>>>>>>>>> *Alias* is a list of other names for the same company. For
>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>> *Rating* is a numerical scale that represents the "worker's
>>>>>>>>>>>>>>> monthly cost of an unlimited transit pass". The scale provided during the
>>>>>>>>>>>>>>> meeting went from "4 leaves" to "brown tortoise"; aligning to the leaves,
>>>>>>>>>>>>>>> that gives us a scale of [-1, 0, 1, 2, 3, 4]. We could adjust this up to
>>>>>>>>>>>>>>> 0-5, or lump "piggy bank" and "brown tortoise" in the same rating.
>>>>>>>>>>>>>>> *Description* is a string that describes the employer's
>>>>>>>>>>>>>>> transit benefits, i.e. why they got the rating they did.
>>>>>>>>>>>>>>> *Badges* is a list of strings that represent any additional
>>>>>>>>>>>>>>> categories we want to assign to a company (e.g. "industry leader",
>>>>>>>>>>>>>>> "polluter").
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We can make changes to this schema if it makes it easier to
>>>>>>>>>>>>>>> work with our underlying data visualization platform (e.g. Tableau?
>>>>>>>>>>>>>>> DataTables?), but hopefully this is a suitable starting place.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As an example, take a hypothetical record for Seattle Coffee
>>>>>>>>>>>>>>> Works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>     "employer": "Seattle Coffee Works",
>>>>>>>>>>>>>>>     "industry": ["restaurant"],
>>>>>>>>>>>>>>>     "neighborhood": ["cbd", "ballard"],
>>>>>>>>>>>>>>>     "alias": ["Ballard Coffee Works"],
>>>>>>>>>>>>>>>     "rating": 4,
>>>>>>>>>>>>>>>     "description": "Provides 100% ORCA Passport subsidy.",
>>>>>>>>>>>>>>>     "badges": ["leader"]
>>>>>>>>>>>>>>> }
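
The schema and example record above can be sketched as a small Python dataclass. This is an illustration only: the class name `EmployerRecord` and the validation rules are assumptions, and the rating bounds follow the -1 to 4 scale described in this message.

```python
from dataclasses import asdict, dataclass, field
from typing import List

# Bounds of the draft scale described in the thread: -1 up to 4 leaves.
MIN_RATING, MAX_RATING = -1, 4


@dataclass
class EmployerRecord:
    employer: str
    industry: List[str] = field(default_factory=list)
    neighborhood: List[str] = field(default_factory=list)
    alias: List[str] = field(default_factory=list)
    rating: int = 0
    description: str = ""
    badges: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Hypothetical checks; the thread does not define validation rules.
        if not self.employer.strip():
            raise ValueError("employer must be a non-empty string")
        if not MIN_RATING <= self.rating <= MAX_RATING:
            raise ValueError(
                f"rating {self.rating} is outside {MIN_RATING}..{MAX_RATING}"
            )


record = EmployerRecord(
    employer="Seattle Coffee Works",
    industry=["restaurant"],
    neighborhood=["cbd", "ballard"],
    alias=["Ballard Coffee Works"],
    rating=4,
    description="Provides 100% ORCA Passport subsidy.",
    badges=["leader"],
)

print(asdict(record))
```

`asdict` returns a plain dictionary with the same keys as the JSON record, so it can be passed straight to a JSON encoder.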
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Where Our Data Lives (For Now)*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've also taken a rough chop at getting started with the
>>>>>>>>>>>>>>> data. Here, I've just taken the raw list of ORCA Business Passport
>>>>>>>>>>>>>>> employers and assigned a score based on their subsidy percentage, as an
>>>>>>>>>>>>>>> example:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://docs.google.com/spreadsheets/d/1HmOcG7hJLD1G0unCMPcsDnXr4RIA_PMKEE5ne-hhQR8/edit?usp=sharing
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The spreadsheet contains columns for each item of the
>>>>>>>>>>>>>>> employer record, as well as some additional columns to record the raw data
>>>>>>>>>>>>>>> we have on file for that employer, so we can use that data to automatically
>>>>>>>>>>>>>>> or manually determine an employer's rating.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we have data from other sources not listed (e.g. survey
>>>>>>>>>>>>>>> data, City of Seattle data), the "source_" columns can be renamed or added
>>>>>>>>>>>>>>> to represent that source's data. For example, if I want to add data from
>>>>>>>>>>>>>>> the TRU survey, I might rename "__source_b" to "__TRU Survey", then include
>>>>>>>>>>>>>>> results from that survey in that column for each company. (The columns
>>>>>>>>>>>>>>> beginning with two underscores are ones I don't expect to be publicly
>>>>>>>>>>>>>>> available.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> PBworks feels really inadequate for editing large data sets,
>>>>>>>>>>>>>>> and I don't know where else to put it, so it's living in Google Sheets for
>>>>>>>>>>>>>>> now. Set to read-only with the link, for now, but please request editing
>>>>>>>>>>>>>>> permissions so you can add stuff to the sheet.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently, my expectation is that the spreadsheet will be
>>>>>>>>>>>>>>> hand-edited in Google Sheets, and then when we're ready to put live data in
>>>>>>>>>>>>>>> the web app, we can export the sheet to a flat file, which we can then
>>>>>>>>>>>>>>> import into a format appropriate for the website (big ol' JSON file,
>>>>>>>>>>>>>>> database, whatever). Manual process, but probably fine for a project of
>>>>>>>>>>>>>>> this scale; I'm open to alternatives.
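
A minimal sketch of the flat-file-to-JSON step described above, assuming a CSV export whose column names match the record fields. The semicolon used to separate multiple values in one cell is an assumption (the thread does not say how list fields are stored in the sheet), and the function names are hypothetical.

```python
import csv
import io
import json

LIST_FIELDS = ("industry", "neighborhood", "alias", "badges")
LIST_SEPARATOR = ";"  # assumption: the sheet's list delimiter is not specified


def split_cell(cell):
    """Split a multi-value cell into a list of trimmed, non-empty strings."""
    return [part.strip() for part in cell.split(LIST_SEPARATOR) if part.strip()]


def row_to_record(row):
    """Convert one CSV row (a dict of strings) into an employer record."""
    record = {
        "employer": row["employer"].strip(),
        "rating": int(row["rating"]),
        "description": row["description"].strip(),
    }
    for name in LIST_FIELDS:
        record[name] = split_cell(row.get(name) or "")
    return record


def sheet_to_json(csv_text):
    """Convert the exported CSV text into a JSON array of employer records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps([row_to_record(r) for r in reader], indent=2)


# Hypothetical one-row export, for illustration only.
sample = (
    "employer,industry,neighborhood,alias,rating,description,badges\n"
    "Seattle Coffee Works,restaurant,cbd; ballard,Ballard Coffee Works,4,"
    "Provides 100% ORCA Passport subsidy.,leader\n"
)

print(sheet_to_json(sample))
```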
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Things To Do Next*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Aside from the ORCA Passport data and the data we collected
>>>>>>>>>>>>>>> through TRU survey / legwork (on PBworks), do we have any other data
>>>>>>>>>>>>>>> sources that would provide context for a score?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For the data sources we have, we'll have to start filling
>>>>>>>>>>>>>>> out the rest of the spreadsheet, I guess?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, we will need to determine:
>>>>>>>>>>>>>>> a) master list of "industries" we want to support
>>>>>>>>>>>>>>> b) "industry" field(s) for each employer
>>>>>>>>>>>>>>> c) "neighborhood" field(s) for each employer we don't have
>>>>>>>>>>>>>>> one for (or more precise ones than what I have now)
>>>>>>>>>>>>>>> d) which companies get tagged with which badges
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hope that helps.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In solidarity,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephen
>>>>>>>>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: embed_orca_viz_3_html
Type: application/octet-stream
Size: 1662 bytes
Desc: not available
URL: <http://lists.transitriders.org/pipermail/research/attachments/20190908/b26a7272/attachment.obj>

