Big Data Materials

Now available: informational slide decks on Big Data technologies (Hadoop, HBase, Hive, ZooKeeper...)

Introducing the best BI planning and budgeting platform

Forecasts, web and Excel-like interface, mobile apps, QlikView, SAP and Salesforce integration...

Pentaho Analytics: a big leap

Pentaho 7 has been released, with big surprises. Discover with us the improvements in the best open BI suite.

The best selection of Open Source courses

Following the great reception of our eminently practical Open Source courses, we are launching the 2016 sessions.

28 Feb 2017

Machine Learning: Choosing the right estimator



Often the hardest part of solving a machine learning problem is finding the right estimator for the job.
Different estimators are better suited to different types of data and different problems.

The scikit-learn flowchart "Choosing the right estimator" is designed as a rough guide to which estimators to try on your data.
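To make the idea concrete, here is a minimal Python sketch that encodes only the first branches of that flowchart as a plain function. The function name, its parameters, and the exact thresholds are assumptions based on a simplified reading of the flowchart, so treat it as an illustration and check the original chart before relying on it.

```python
def suggest_estimator(n_samples, predicting, labeled=True,
                      few_important_features=False, n_clusters_known=True):
    """Suggest a first estimator to try (hypothetical helper, simplified flowchart).

    predicting: "category", "quantity", or "exploring".
    """
    if n_samples <= 50:
        return "get more data"

    if predicting == "category":
        if labeled:
            # Classification branch
            return "LinearSVC" if n_samples < 100_000 else "SGDClassifier"
        # Clustering branch (no labels available)
        if n_clusters_known:
            return "KMeans" if n_samples < 10_000 else "MiniBatchKMeans"
        return "MeanShift" if n_samples < 10_000 else "no clear suggestion"

    if predicting == "quantity":
        # Regression branch
        if n_samples >= 100_000:
            return "SGDRegressor"
        return "Lasso" if few_important_features else "Ridge"

    if predicting == "exploring":
        # Dimensionality reduction branch
        return "PCA"

    raise ValueError("predicting must be 'category', 'quantity' or 'exploring'")


first_try = suggest_estimator(n_samples=1_000, predicting="category")
```

The returned names are only starting points; the flowchart itself lists fallbacks to try when the first choice does not work.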

24 Feb 2017

Leaflet and R


Leaflet 1.1.0 is now available on CRAN! The Leaflet package is a tidy wrapper for the Leaflet.js mapping library, and makes it incredibly easy to generate interactive maps based on spatial data you have in R.




Leaflet is one of the most popular open-source JavaScript libraries for interactive maps. It’s used by websites ranging from The New York Times and The Washington Post to GitHub and Flickr, as well as GIS specialists like OpenStreetMap, Mapbox, and CartoDB.

This release was nearly a year in the making, and includes many important new features.
  • Easily add textual labels on markers, polygons, etc., either on hover or statically
  • Highlight polygons, lines, circles, and rectangles on hover
  • Markers can now be configured with a variety of colors and icons, via integration with Leaflet.awesome-markers
  • Built-in support for many types of objects from sf, a new way of representing spatial data in R (all basic sf/sfc/sfg types except MULTIPOINT and GEOMETRYCOLLECTION are directly supported)
  • Projections other than Web Mercator are now supported via Proj4Leaflet
  • Color palette functions now natively support viridis palettes; use "viridis", "magma", "inferno", or "plasma" as the palette argument
  • Discrete color palette functions (colorBin, colorQuantile, and colorFactor) work much better with color brewer palettes
  • Integration with several Leaflet.js utility plugins
  • Data with NA points or zero rows no longer causes errors
  • Support for linked brushing and filtering, via Crosstalk (more about this to come in another blog post)
Seen on blog.rstudio
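The discrete palette functions mentioned in the list (colorBin and its siblings) map numeric values onto a small set of colors. As a rough illustration of the binning idea only, here is a minimal Python sketch. It is not the R package's API: the function name `color_bin`, its parameters, the gray fallback color, and the four viridis-like hex values are all assumptions made for the example.

```python
from bisect import bisect_right

# Approximate viridis-like colors, dark to light (illustrative values).
PALETTE = ["#440154", "#31688e", "#35b779", "#fde725"]


def color_bin(palette, domain, na_color="#808080"):
    """Return a function that maps a number to a color using equal-width bins."""
    lo, hi = domain
    n = len(palette)
    width = (hi - lo) / n
    # Interior breakpoints between consecutive bins.
    edges = [lo + i * width for i in range(1, n)]

    def to_color(value):
        # Missing or out-of-domain values get the fallback color.
        if value is None or not (lo <= value <= hi):
            return na_color
        return palette[bisect_right(edges, value)]

    return to_color


to_color = color_bin(PALETTE, domain=(0, 100))
colors = [to_color(v) for v in (10, 25, 100, None)]
```

A quantile-based variant would compute the breakpoints from the data's quantiles instead of splitting the domain into equal widths.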

23 Feb 2017

Citus 6.1 Released: scale your PostgreSQL database

Interesting news from Citus Data; see the Community Edition.

Citus is a distributed database that scales out PostgreSQL (one of our favorite databases), letting you use all of PostgreSQL's functionality together with the benefits of scaling out.

Microservices and NoSQL get a lot of hype, but in many cases what you really want is a relational database that simply works, and can easily scale as your application data grows. Microservices can help you split up areas of concern, but also introduce complexity and often heavy engineering work to migrate to them. Yet there are a lot of monolithic apps out there that do need to scale.

If you don’t want the added complexity of microservices, but do need to continue scaling your relational database, then you can with Citus. With Citus 6.1 we’re continuing to make scaling out your database even easier with all the benefits of Postgres (SQL, JSONB, PostGIS, indexes, etc.) still packed in there.

With this new release customers like Heap and Convertflow are able to scale from single node Postgres to horizontal linear scale. Citus 6.1 brings several improvements, making scaling your multi-tenant app even easier. These include:
  • Integrated reference table support
  • Tenant Isolation
  • View support on distributed tables
  • Distributed Vacuum / Analyze

All of this with the same language bindings, clients, drivers, and libraries (like ActiveRecord) that Postgres already works with.
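To give a feel for what sharding a multi-tenant app by tenant means, here is a toy Python sketch. It is not how Citus is implemented: the shard count, worker names, hash function, and helper names are all hypothetical. It only illustrates two ideas from the list above: rows that share a tenant id land on the same node, while small reference tables are copied to every node.

```python
import hashlib

SHARD_COUNT = 8                      # hypothetical number of shards
WORKERS = ["worker-1", "worker-2"]   # hypothetical worker nodes


def shard_for(tenant_id):
    """Hash the distribution column value (the tenant id) to a shard number."""
    digest = hashlib.sha256(str(tenant_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SHARD_COUNT


def worker_for(shard_id):
    """Assign shards to workers round-robin."""
    return WORKERS[shard_id % len(WORKERS)]


def placements(table_kind, tenant_id=None):
    """Return the workers that hold a row of the given kind of table."""
    if table_kind == "reference":
        # Reference tables are replicated to every worker.
        return list(WORKERS)
    # Distributed tables: one shard, chosen by the tenant id.
    return [worker_for(shard_for(tenant_id))]


orders = placements("distributed", tenant_id="acme")
invoices = placements("distributed", tenant_id="acme")
countries = placements("reference")
```

Because `orders` and `invoices` for the same tenant resolve to the same worker, a query filtered by tenant can be answered by a single node, and joins against a reference table stay local.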

21 Feb 2017

How to create your own Dashboards in Pentaho?



Here is a sneak preview of the new functionality we are adding to Pentaho so that end users can create their own powerful dashboards in minutes. We call it STDashboard, built by our colleagues at Stratebi.

The new functionality includes new templates, panel resizing, drag and drop, removing and creating panels, the Pentaho 7 upgrade...

As always, and like our other Pentaho plugins, it is free and included in all of our projects. Check DemoPentaho Online, where all new components are updated frequently.

You can also use it directly in your own projects, and we can help with configuration, training, and support.


Video in action (Dashboards in minutes):

15 Feb 2017

Glossary of Business Intelligence Terms

For everyone who is getting started in the world of Business Intelligence, here is a glossary of the main Business Intelligence terms. Seen on the Panorama blog.

If you want to play with an open, open source demo to learn and try out these concepts, it is the best way to become familiar with them.

Glossary of Business Intelligence Terms:

  • Automated Analysis: Automatic analysis of data to find hidden insights in the data and show users the answers to questions they have not even thought of yet.
  • BI Analyst: As stated by modernanalyst.com, a data analyst is a professional who is in charge of analyzing and mining data to identify patterns and correlations, mapping and tracing data from system to system in order to solve a problem, using BI and data discovery tools to help business executives in their decision making, and performing statistical analysis of business data, among other things. (Also called a data analyst.)
  • BI Governance: According to Boris Evelson, from Forrester Research, BI governance is a key part of data governance, but it focuses on a BI system and governs who uses the data, when, and how.
  • Big Data: Enormous and complex data sets that traditional data processing tools cannot deal with.
  • Bottlenecks: Points of congestion or blockage that hinder the efficiency of the BI system.
  • Business Intelligence: According to Gartner, “Business Intelligence is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.”
  • Centralized Business Intelligence: A BI model that enables users to work connected and share insights, while seeing the same and only version of the truth. IT governs over data permissions to ensure data security.
  • Collaborative BI: An approach to Business Intelligence where the BI tool empowers users to collaborate between colleagues, share insights, and drive collective knowledge to improve decision making.
  • Collective Knowledge: Knowledge that benefits the whole enterprise as it comes from the sharing of insights and data findings across groups and departments to enrich analysis.
  • Dark Data: According to Gartner, the definition for Dark Data is “information assets that organizations collect, process and store in the course of their regular business activity, but generally fail to use for other purposes”. 90% of companies’ data is dark data.
  • Dashboards: A data visualization tool that displays the current enterprise health, the status of metrics and KPIs, and the current data analysis and insights.
  • Data Analyst: As stated by modernanalyst.com, a data analyst is a professional who is in charge of analyzing and mining data to identify patterns and correlations, mapping and tracing data from system to system in order to solve a problem, using BI and data discovery tools to help business executives in their decision making, and performing statistical analysis of business data, among other things.
  • Data Analytics: According to TechTarget, “data analytics is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software.”
  • Data Governance: According to Boris Evelson, from Forrester Research, data governance “deals with the entire spectrum (creation, transformation, ownership, etc.) of people, processes, policies, and technologies that manage and govern an enterprise’s use of its data assets (such as data governance stewardship applications, master data management, metadata management, and data quality).”
  • Data Mashup: An integration of multiple data sets into a unified analytical and visual representation.
  • Data Silos: According to TechTarget, a data silo is “data that is under the control of one department or person and is isolated from the rest of the organization.” Data silos are a bottleneck for effective business operations.
  • Data Sources: The source where the data to be analyzed comes from. It can be a file, a database, a dataset, etc. Modern BI solutions like Necto can mash up data from multiple data sources.
  • Data Visualization: The graphic visualization of data. Can include traditional forms like graphs and charts, and modern forms like infographics.
  • Data Warehouse: A relational database that integrates data from multiple sources within a company.
  • Embedded Analytics: The integration of reporting and data analytic capabilities in a BI solution. Users can access full data analysis capabilities without having to leave their BI platform.
  • Excel Hell: A situation where the enterprise is full of unnecessary copies of data, thousands of spreadsheets get shared, and no one knows with certainty which is the most updated and real version of the data.
  • Federated Business Intelligence: A BI model where users work in separate desktops, creating data silos and unnecessary copies of data, leading to multiple versions of the truth.
  • Geo-analytic capabilities: The ability that a BI or data discovery tool has to analyze data by geographical area and reflect such analysis on maps on the user’s dashboard.
  • Infographics: Visual representations of data that are easily understandable and drive engagement.
  • Insights: According to Forrester Research, insights are “actionable knowledge in the context of a process or decision.”
  • KPI: Key Performance Indicator. A quantifiable measure that a business uses to determine how well it is meeting its operational and strategic goals. KPIs give managers insight into what is happening at any specific moment and allow them to see in what direction things are going.
  • Modern BI: An approach to BI using state of the art technology, providing a centralized and secure platform where business users can enjoy self-service capabilities and IT can govern over data security.
  • OLAP: Stands for Online Analytical Processing, a technology for multidimensional data analysis and discovery. Panorama Software developed an OLAP technology that it sold to Microsoft in 1996. OLAP has many capabilities, such as complex analytics, predictive “what if” scenario planning, and limitless report viewing.
  • Scalability: The ability of a BI solution to be used by a larger number of users as time passes.
  • Self-Service BI: An approach that allows business users to access and work with data sources even though they do not have an analyst or computer science background. They can access, profile, prepare, integrate, curate, model, and enrich data for analysis and consumption by BI platforms. In order to have successful self-service BI, the BI tool must be centralized and governed by IT.
  • Smart Data: Smaller data sets from Big Data that are valuable to the enterprise and can be turned into actionable data.
  • Smart Data Discovery: The processing and analysis of Smart Data to discover insights that can be turned into actions to make data-driven decisions in an organization.
  • Social BI: An approach where social media capabilities, such as social networking, crowdsourcing, and thread-based discussions are embedded into Business Intelligence so that users can communicate and share insights.
  • Social Enterprise: An enterprise that has a new level of corporate connectivity, leveraging the social grid to share and collaborate on information and ideas. It drives a more efficient operation where problems are uncovered and fixed before they can affect the revenue streams.
  • SQL: Stands for Structured Query Language. It is a language used for querying and managing data in relational databases.
  • State of the Art BI: The highest level of technology, the most up-to-date features, and the best analysis capabilities in a Business Intelligence solution.
  • Suggestive Discovery Engine: An engine behind the program that recommends to the users the most relevant insights to focus on, based on personal preferences and behavior.
  • Systems of Insight: This is a term coined by Boris Evelson, VP of Forrester Research. It is a Business Intelligence system that combines data availability with business agility, where both IT and business users work together to achieve their goals.
  • Workboards: An interactive data visualization tool. It is like a dashboard that displays the current status of KPIs and other data analysis, with the possibility to work directly on it and do further analysis.
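A couple of these terms are easier to grasp with a tiny example. The Python sketch below uses the standard-library sqlite3 module to run one SQL query that rolls sales up by region, then checks each total against a target, which is the essence of a KPI. The table, the figures, and the target value are invented purely for illustration.

```python
import sqlite3

# In-memory database with a made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "2017-01", 120.0),
        ("North", "2017-02", 80.0),
        ("South", "2017-01", 60.0),
        ("South", "2017-02", 90.0),
    ],
)

# SQL: roll monthly rows up to one total per region.
totals = conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# KPI: compare each regional total with a (hypothetical) revenue target.
TARGET = 180.0
kpi_status = {
    region: "on target" if total >= TARGET else "below target"
    for region, total in totals
}
conn.close()
```

A dashboard would typically show such a status as a colored indicator rather than text, but the underlying comparison is the same.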

4 Feb 2017

LinceBI, the best Analytics/BigData open source based solution!!

As powerful as an enterprise version, with the advantages of being Open Source based. Discover LinceBI, the most complete Business Intelligence platform, including all the functionality you need.


Dashboards
  • User friendly, templates and wizard
  • Technical skills are not required
  • Link to external content
  • Browse and navigate on cascade dependency graphs
Analytic Reporting
  • PC, Tablet, Smartphone compatibility
  • Syncs your analysis with other users
  • Download information on your device
  • Make better decisions anywhere and anytime
Bursting
  • Different output formats (CSV, Excel, PDF, HTML)
  • Task scheduling for automatic execution
  • Mailing
Balanced Scorecard
  • Assign customized weights to your KPIs
  • Edit your data on the fly or upload an Excel template
  • Follow your key performance indicators
  • Visual KPIs with traffic-light colours
  • Assign color coding to your thresholds
  • Define your own key performance indicators
Accessibility
  • Make calculated fields on the fly
  • Explore your data on charts
  • Drill-down and roll-up capabilities
  • What-if analysis and mailing

Adhoc Reporting
  • Build your reports easily, drag and drop
  • Models and language designed for business users
  • Corporate templates for your company
  • Advanced filters
Alerts
  • Configure your thresholds
  • Mapping alerts and business rules
  • Plan actions for when an event happens
Check the FAQs section if you have any questions.